Tuesday, September 25, 2012

A C64 in hindsight (or: if they knew then...)

If they knew then what they know now, I think the C64 could have had some better-thought-out components.  For example:

* The power supply could have had a universal power cord, so that peripherals and the CPU could use the same unit.  It might even have had two outlets so a single supply could power (for example) the C64 and a disk drive.  This would have decreased the price and complexity of the disk drive a bit, and increased its life span, as well.

* The 1541 could have had a logical track and sector layout, regardless of the physical layout, to simplify I/O.

* The 1541 layout could have put the image metadata in the header at location $10-$27, instead of $90-$A7, and started the BAM at $28, allowing contiguous BAM entries thru Track 40 (and theoretically thru Track 53).

* They could have spent just a little more time to fix the IEEE chip, to bring its speed back up to normal.
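
To make the "logical track and sector" idea from the list above concrete, here is a minimal Python sketch that maps a flat logical block number onto the standard 35-track 1541 geometry. The zone table is the usual 1541 one (21, 19, 18, and 17 sectors per track); the function name `block_to_track_sector` is made up for illustration and is not anything the drive's DOS provides.

```python
# Standard 35-track 1541 zones: (number of tracks, sectors per track).
ZONES = [(17, 21), (7, 19), (6, 18), (5, 17)]


def block_to_track_sector(block):
    """Map a logical block number (0-based) to a physical (track, sector).

    Tracks are numbered from 1 and sectors from 0, as on the 1541.
    """
    if block < 0:
        raise ValueError("block must be non-negative")
    first_track = 1
    for track_count, sectors in ZONES:
        zone_blocks = track_count * sectors
        if block < zone_blocks:
            return first_track + block // sectors, block % sectors
        block -= zone_blocks
        first_track += track_count
    raise ValueError("block is past the end of the disk")
```

With a mapping like this in the drive, software would only ever ask for "block 357" and never need to know that it lives at track 18, sector 0.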

Monday, August 13, 2012

CargoCult in a Nutshell

Here's my working syntax for CargoCult.

assumptions

numerical expressions are per C standards.


variable declarations:

my [<type>] <id> [= <initialization expression>];

Array variables start with the sigil '@'.  They're indexed with square brackets, as in C.


function declarations:

fn <returntype> <name> parm1, parm2, ...

<body>
endfn


function calls:

[<return value> =] <function name>([<parameters>]);


function parameters are comma-separated, and typed or untyped.  If typed, the type precedes the identifier, e.g. callMyFunction( String foo, int bar );


for loops (currently only increment, by 1):

for |<indexname>| <start>..<end>

<body>
endfor


if statements:

if ( <expression> )
<body>
endif


return statements:

return <expression>;


Wednesday, August 8, 2012

CargoCult as an Intermediate Language

I face the onerous task of converting my Commodore image reading code from AS3 into Perl and Objective-C.

Rather than port code twice to two platforms, I'd rather use CargoCult as the specification, and use real languages as targets.

I don't have to get 100% code conversion: I just need to get 80% of the way there to make this worthwhile.

That means CargoCult is a high-level Intermediate Language of sorts.  It's C-like, but uses syntactic sugar in a way that makes it relatively easy to write generators to transform it to other languages.  My goal is to be able to make line-by-line translators without having to do any real analysis of the code.

Here's a sample of CargoCult 1.0.

fn int buildZones totalSectors, startTrack, @zones

   for |index| 0..@zones.length
  
      my track = 1 + GLOBAL.totalTracks + startTrack;
      my sectorCount = @zones[index][1];
      my endTrack = 1 + GLOBAL.totalTracks + @zones[index][0];
     
      GLOBAL.totalTracks += @zones[index][0];
     
      for |jdex| track..endTrack
     
         GLOBAL.@trackOffset[ jdex ] = totalSectors * 0x100;
         GLOBAL.@sectorOffset[ jdex ] = totalSectors;
         GLOBAL.@sectorsInTrack[ jdex ] = sectorCount;
        
         totalSectors += sectorCount;
     
      endfor
     
   endfor
  
   return totalSectors;
  
endfn


I've successfully translated this into fully functional Perl, ActionScript3, and Objective-C.  It took 80 lines of code for each, but after that the translator was able to translate another CargoCult function, as well.

What I want to do next is build up a set of translations for each target language, for each transformation needed (line preprocessing, library call handling, subroutine handling, loop handling, and line postprocessing).
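
As a rough illustration of the line-by-line approach, here is a minimal sketch of such a translator written in Python, with Python as the target. It is not the 80-line translator described above. It handles only fn, for, if, my, and the end markers, and it makes assumptions the post does not settle: the end of a `a..b` range is treated as exclusive, `x.length` becomes `len(x)`, and the names `translate` and `BLOCK_ENDS` are invented for the example.

```python
import re

BLOCK_ENDS = {"endfn", "endfor", "endif"}


def translate(source):
    """Translate a small CargoCult subset to Python, one line at a time."""
    out, depth = [], 0
    for raw in source.splitlines():
        line = raw.strip().rstrip(";").strip()
        if not line:
            continue
        # Line preprocessing: drop the '@' sigil, turn x.length into len(x).
        line = line.replace("@", "")
        line = re.sub(r"(\w+(?:\.\w+)*)\.length\b", r"len(\1)", line)

        if line in BLOCK_ENDS:  # end markers only close the current block
            depth -= 1
            continue

        opens_block = True
        if m := re.fullmatch(r"fn\s+\w+\s+(\w+)\s*(.*)", line):
            # Typed parameters keep only the identifier (the last word).
            params = [p.split()[-1] for p in m.group(2).split(",") if p.strip()]
            text = f"def {m.group(1)}({', '.join(params)}):"
        elif m := re.fullmatch(r"for\s+\|(\w+)\|\s*(.+?)\s*\.\.\s*(.+)", line):
            # Assumption: the range end is exclusive.
            text = f"for {m.group(1)} in range({m.group(2)}, {m.group(3)}):"
        elif m := re.fullmatch(r"if\s*\((.*)\)", line):
            text = f"if {m.group(1).strip()}:"
        else:
            opens_block = False
            m = re.fullmatch(r"my\s+(?:\w+\s+)?(\w+)\s*(?:=\s*(.*))?", line)
            if m:
                value = m.group(2) if m.group(2) is not None else "None"
                text = f"{m.group(1)} = {value}"
            else:
                text = line  # return, assignment, call: pass through as-is
        out.append("    " * depth + text)
        if opens_block:
            depth += 1
    return "\n".join(out)
```

Each branch looks at one line in isolation and tracks nothing but the indent depth, which is the point: no parse tree, no analysis. A second target language would swap the output templates and keep the same patterns.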

Thursday, June 28, 2012

CN - "C Data Notation"

Just for fun, here's an idea for a data notation that's sort of C-like in structure.  Does it have enough representational power to be useful?

CN  "C Data Notation"

A CN document is a list of items, separated by one or more newlines, like so:

list item one
list item two

list item three

---

CN documents are delimited by the triple-dash made popular by YAML:

---

Multi-line items use the C-style backslash \
at the end of the line.

---

Pairs are specified using the colon, like so:

key1: value1
key2: value2
key3: value3

Terminating colons do not represent a pair.

A group can be created using curly braces.

{
   key1: value1
   key2: value2
   key3: value3
   key4: value4\
         line2\
         line3\
         last line
        
}

It's not a hash; it's a key-value list, with the pairs stored in order.

myLevel1:
{
   myLevel2:
   {
      subkey: 123
   }
}


A value from a pair can be referenced with an asterisk and the key's qualified namespace:

*myLevel1.myLevel2.subkey
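
To get a feel for whether the notation holds together, here is a minimal Python sketch of a CN reader. It fills in several details the description leaves open, so treat these as assumptions: continued lines are joined with a newline, a line ending in a colon that is immediately followed by "{" names that group, pairs are split on the first ": ", and all values stay strings. The function names are made up for the example.

```python
def join_continuations(text):
    """Merge each line ending in a backslash with the line that follows it."""
    merged, pending = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if line.endswith("\\"):
            pending.append(line[:-1].rstrip())
            continue
        merged.append("\n".join(pending + [line]))
        pending = []
    if pending:
        merged.append("\n".join(pending))
    return merged


def parse_items(lines, pos=0):
    """Parse items until a closing brace; return (items, next position)."""
    items = []
    while pos < len(lines):
        line = lines[pos]
        pos += 1
        if not line:
            continue
        if line == "}":
            break
        if line == "{":
            group, pos = parse_items(lines, pos)
            items.append(group)
        elif line.endswith(":") and pos < len(lines) and lines[pos] == "{":
            group, pos = parse_items(lines, pos + 1)
            items.append((line[:-1], group))  # named group, kept as a pair
        elif ": " in line:
            key, value = line.split(": ", 1)
            items.append((key, value))  # pairs stay in document order
        else:
            items.append(line)  # plain list item
    return items, pos


def parse_cn(text):
    """Return a list of documents; each document is a list of items."""
    documents, current = [], []
    for line in join_continuations(text) + ["---"]:
        if line != "---":
            current.append(line)
            continue
        if any(current):  # skip empty documents
            documents.append(parse_items(current)[0])
        current = []
    return documents


def lookup(items, reference):
    """Resolve a reference such as '*myLevel1.myLevel2.subkey'."""
    node = items
    for name in reference.lstrip("*").split("."):
        pairs = [item for item in node if isinstance(item, tuple)]
        node = next(value for key, value in pairs if key == name)
    return node
```

Pairs come back as tuples in a list rather than as a dict, which keeps the "key-value list, not a hash" property. Feeding it the nested myLevel1 example and looking up `*myLevel1.myLevel2.subkey` returns the string "123".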

Friday, June 1, 2012

Traveller5 Kickstarter project announced

For the new hardcover Traveller rulebook.

http://www.kickstarter.com/projects/traveller5/traveller-5th-edition

Wednesday, April 4, 2012

The D40 Commodore Image format

The D40 is an exercise in using my flexible image configuration.  Its goal is to design the largest image possible, which still uses the header block (and only the header block) to store BAM entries.

In other words, this format uses the header block as its primary design limitation.

Header

The most efficient existing Commodore header is used by the D81.  It makes a great place to start out.

The D81's header starts out the same as all others: a two-byte pointer to the first directory sector, then the DOS type, then one byte with $00, for a total of 4 bytes.

The label offset is at 0x04.  All Commodore labels consist of a 16-byte label, two $A0 bytes, two bytes for the Disk ID, one more $A0, two bytes for the DOS Type, and a final $A0 byte, for a total of 24 bytes.

That leaves (256 - 28) = 228 bytes for the BAM.  Now the fun begins.

BAM

The Block Allocation Map (BAM) consists of an array of records, one record for each track on the disk.  The first byte of each record is the "sectors free" count for that track.  The remaining data is a bitmap of the sector allocation for that track: following the convention of the existing Commodore formats, a "1" means the sector is free for use, while a "0" means it is allocated.

A little calculation will find several schemes which fit into 228 bytes.  Larger bitmaps tend to be more efficient with the space available.  The layout I select is 25 tracks of 64 sectors each.  The number of bytes needed for the BAM is (25 x (1+64/8)) = 25 x 9 = 225 bytes.

The total capacity of this image would be (25 x 64) blocks = 1600 blocks, or 400k.
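
The arithmetic above is easy to check, and to rerun for other layouts, with a few lines of Python. This is only a sketch of the calculation as described: the constant and function names are made up, and a bitmap is assumed to round up to whole bytes when the sector count is not a multiple of 8.

```python
SECTOR_SIZE = 256
HEADER_PREFIX = 4                    # directory pointer (2) + DOS type (1) + $00 (1)
LABEL_SIZE = 16 + 2 + 2 + 1 + 2 + 1  # label, $A0s, disk ID, DOS type = 24
BAM_BUDGET = SECTOR_SIZE - HEADER_PREFIX - LABEL_SIZE  # 228 bytes


def record_bytes(sectors_per_track):
    """One 'sectors free' byte plus one bit per sector, rounded up to bytes."""
    return 1 + (sectors_per_track + 7) // 8


def bam_bytes(tracks, sectors_per_track):
    """Total BAM size for a layout."""
    return tracks * record_bytes(sectors_per_track)


def max_tracks(sectors_per_track):
    """Most tracks whose BAM records still fit in the header block."""
    return BAM_BUDGET // record_bytes(sectors_per_track)


def capacity_kib(tracks, sectors_per_track):
    """Raw image size in KiB, counting 256 bytes per block."""
    return tracks * sectors_per_track * SECTOR_SIZE / 1024


print(bam_bytes(25, 64), max_tracks(64), capacity_kib(25, 64))
```

For the 25 x 64 layout this prints a 225-byte BAM, a maximum of 25 tracks, and 400.0 KiB. Calling `max_tracks` with other sector counts is a quick way to compare candidate schemes.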

Layout

I like to see the header at track 1 -- it's easier for a programmer to get at than at midpoint.  The directory can have the remaining 63 sectors in track 1, for a maximum of 63 * 8 = 504 files, which should be plenty.

Track 1 holds the header and the directory, which takes 64 blocks (16k) out of circulation.  The remainder of the disk is usable for file storage, for a total storage space of 400 - 16 = 384k.

Tuesday, April 3, 2012

Commodore Disk Image headers, again

One minor nitpick about Commodore disk images is that they have no signature line.  The only way you can tell what they are is to look at the extension, the file size, and perhaps try to jump to the header sector and "see" if it looks right.  While this is not a major problem, I think there is a simple solution; namely, to add a signature to each disk image.

A signature is a small, initial data set which you can use to determine the nature of the disk unconditionally.  My suggestion is to look for an optional 32-byte signature on all Commodore images; if it proves useful, then over time all such images will have this signature.

Examples.

D64 images will start with "1541 DISK IMAGE ".
D71 images will start with "1571 DISK IMAGE ".
D81 images will start with "1581 DISK IMAGE ".
D82 images will start with "8250 DISK IMAGE ".

...and so on.

The remaining 16 bytes should be used to specify the image configuration as clearly as possible.  For example, the D64 should have a byte for how many tracks are present (i.e. 35, 40, or some other number), and a byte indicating whether or not error bytes are appended to the end of the image.  I would also suggest another byte used to indicate an auxiliary directory track, but that starts to make things complicated.

As I said, this data can be inferred from the image itself, but it is much better to be explicit, and the simplest way to do that is to lead with a short signature block.
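
Here is a minimal Python sketch of what writing and reading such a signature block might look like. The 16-byte tags are the ones listed above. Everything about the remaining 16 bytes is an assumption made for illustration: the track count is placed in byte 16, the error-bytes flag in byte 17, the rest is zero-filled, and the tags are treated as plain ASCII. The function names are hypothetical.

```python
SIGNATURE_SIZE = 32
TAG_SIZE = 16

TAGS = {
    "D64": b"1541 DISK IMAGE ",
    "D71": b"1571 DISK IMAGE ",
    "D81": b"1581 DISK IMAGE ",
    "D82": b"8250 DISK IMAGE ",
}


def build_signature(kind, tracks, has_error_bytes):
    """Build a 32-byte signature: 16-byte tag plus 16 configuration bytes."""
    # Assumed layout: byte 16 = track count, byte 17 = error-bytes flag.
    config = bytes([tracks, int(has_error_bytes)])
    return TAGS[kind] + config.ljust(SIGNATURE_SIZE - TAG_SIZE, b"\x00")


def read_signature(image):
    """Return the decoded signature, or None if the image has none."""
    head = image[:SIGNATURE_SIZE]
    if len(head) < SIGNATURE_SIZE:
        return None
    for kind, tag in TAGS.items():
        if head.startswith(tag):
            return {
                "kind": kind,
                "tracks": head[TAG_SIZE],
                "has_error_bytes": bool(head[TAG_SIZE + 1]),
            }
    return None
```

Because the signature is optional, a reader falls back to today's guesswork (extension and file size) whenever `read_signature` returns None.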

Having a "number of tracks" byte could be space-efficient as well, because many images have content much smaller than the disk's capacity; in these cases it would be possible to publish a smaller D64. Since the 18th track is required, the smallest D64 would be 18 tracks long, or about 94k -- a little over half the size of the standard D64.
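
The size of a truncated D64 follows directly from the 1541's sectors-per-track table. This short Python sketch does the sum; it counts only the sector data (no signature block and no error bytes), and the function name is made up for illustration.

```python
# Sectors per track on a 1541: tracks 1-17, 18-24, 25-30, 31-40.
SECTORS_PER_TRACK = [21] * 17 + [19] * 7 + [18] * 6 + [17] * 10
SECTOR_SIZE = 256


def d64_size(tracks):
    """Size in bytes of a D64 holding only the first `tracks` tracks."""
    if not 1 <= tracks <= len(SECTORS_PER_TRACK):
        raise ValueError("track count out of range")
    return sum(SECTORS_PER_TRACK[:tracks]) * SECTOR_SIZE


print(d64_size(18), d64_size(35))
```

This prints 96256 and 174848: the 18-track image against the standard 35-track one.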

Wednesday, March 28, 2012

Java to ActionScript (via Perl)

#!/usr/bin/perl

while(<>)
{
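   s/\bfinal\b\s*//;                 # drop 'final' (but leave 'finally' alone)

   # Method signatures go first; otherwise the variable rules below
   # consume "int foo(...)" as if it were a variable declaration.
   s/ (void|int|String) (\w+\(.*?\))/ function $2:$1/;

   s/\b(int|long) (\w+)/ var $2:int/;
   s/\bboolean (\w+)/ var $1:Boolean/;
   s/\bString (\w+)/ var $1:String/;
   s/System\.out\.println/trace/;    # dots escaped so they match literally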


   print;
}


The Commodore 1541 disk drive is a computer, with a 6502 microprocessor and its own RAM.  It talks to the Commodore 64 via a hastily-built proprietary serial variant of the IEEE-488 bus.

And it's a pain to emulate.

Luckily, it's a solved problem, more or less, if your chosen programming language is C++ or Java.  If you want to do it in, say, ActionScript, then you are out of luck.

...unless you know Perl.

ActionScript, as you may know, has a fuzzy relationship with Java.  Its compiler is written in Java.  Its VM may very well be based on the JVM.  So it is no surprise that ActionScript source is in many ways a cipher of Java.

I wrote a very small Perl script to convert Java source to ActionScript source.  It doesn't do a 100% job, but in all things the best is the enemy of the good, and the Burrito Principle holds (80% of the meat is in 20% of the burrito).  So this gets me most of the way there, leaving small scraps to deal with (instead of facing a complete and more tedious rewrite).

Saturday, March 24, 2012

I like Objective-C

So far.  I'm not sure if it's as accessible as ActionScript, but I really appreciate its strict adherence to Design Patterns.  Just a few lessons in, and we've already done MVC and Delegates.

And, of course, I always loved the Smalltalk syntax.

Friday, March 2, 2012

Time for a new internet browser

Time to be an old grump for a moment.

I've said it before, I'll say it again.  It's time to rewrite the browser.  Invent, create, realize a new way of browsing the internet.

Forget HTML, JavaScript, Flash Player et al.  Computers are powerful; why aren't browsers?  Why can't you develop on the browser the same way you develop directly onto the operating system?  Why isn't there a virtual machine to which you may directly target compilers?  That way, you have your cake and can eat it, too.

I'm not saying the browser should be an operating system; it's an application.  However, it should integrate with operating systems.  For example, security is an OS problem; it should not be an application's problem.  Why solve the same problem over and over again?  There are realtime impacts to this: HTTP and HTTPS are heavy compared to TFTP.

I am saying that HTML is annoying.  I don't think HTML5 will solve that problem - at least, it won't solve it anytime soon.  HTML is to the browser what Java is to the OS: a language, in this case a display and layout language.  It defines the View.

Thursday, February 23, 2012

CargoCult, part one

This is a post about my dream language, which I've named CargoCult.  It's a mashup of Perl, Objective-C, JavaScript, Shell, and other things.


It does NOT eschew the use of shifted characters -- it just requires that they be important, with a value greater than the extra effort of typing shift + something.


Object Notation

CargoCult is a dynamic object language.  This means you have type-able structures, potentially dynamic, which have attributes and methods.

Core language features -- arrays, hashes, variables -- are objects.  For example, the implicit array type is an object, so you can do things like this:

    return [d1, d2, d3].sort.reverse.pop; 


Hashes and arrays use the grouping notation of square brackets. An array is a comma-separated list of scalars.  A hash is a comma-separated list of assignments.

     my array = 1, 2, 3, 'four';  # also [ 1, 2, 3, 'four' ]
     my hash  = [year = 2012, month = two, day = 23];


Hash and array accesses are object calls.

    my value = hash.year;
    my other_value = array.0;


Method Calling with Parameters

When we write methods in any language, we typically name formal parameters.  For example:

string myFunction( foo, bar )
{
   foo + ': ' + bar;
}

foo and bar are formal parameters, i.e. the names used in the method.

When calling a method, parameters are passed in by name.  In other words, the parameters are more or less a hash.

my str = myObj myFunction .foo 'hello' .bar 'world';


When you have to nest the call, use the backslash to indicate a method call (rather than a new array), and then square brackets for grouping.

           my str = myObj myFunction .foo \[myObj myFunction .foo 'hello' .bar 'world'] .bar '!';
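
For comparison, the closest everyday analogue to this calling style is probably keyword arguments. The Python sketch below mirrors the two calls above; the class and method are stand-ins invented for the example, and it says nothing about how CargoCult itself would be implemented.

```python
class MyObj:
    def my_function(self, foo, bar):
        """Same body as myFunction above: join the two parameters."""
        return foo + ": " + bar


my_obj = MyObj()

# Parameters passed by name, as in: myObj myFunction .foo 'hello' .bar 'world'
simple = my_obj.my_function(foo="hello", bar="world")

# Nested call: the inner result becomes the outer call's foo parameter.
nested = my_obj.my_function(
    foo=my_obj.my_function(foo="hello", bar="world"),
    bar="!",
)

# "The parameters are more or less a hash": a dict can be passed directly.
params = {"foo": "hello", "bar": "world"}
from_hash = my_obj.my_function(**params)
```

Here `simple` and `from_hash` are both "hello: world", and `nested` is "hello: world: !".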