Thursday, October 13, 2011

YSON is a Structured Object Notation

Here's my proposed structure.  I want to keep the quotes to a minimum.

records:
[
    !record record:
    {
      !game &001 game:
      {
        date: 'March 2, 1962',
        versus: New York
      },
      notes: Awesome!,
      number: 100,
      !player &002 player:
      {
        name: Wilt Chamberlain,
        team: Philadelphia
      },
      record: Most points single game
    },
    
    !record record:
    {
        game: *001,
        number: 59,
        player: *002,
        record: 'Most points, one half'
    }
]
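
If the & markers are read as anchors and the * markers as aliases (the YAML convention), the example above deserializes to something like the following Perl structure. This is just a sketch to show the intent -- the point is that the second record shares the game and player hashes rather than copying them:

my $game   = { date => 'March 2, 1962', versus => 'New York' };
my $player = { name => 'Wilt Chamberlain', team => 'Philadelphia' };

my $records = [
    {                                # !record
        game   => $game,             # &001
        notes  => 'Awesome!',
        number => 100,
        player => $player,           # &002
        record => 'Most points single game',
    },
    {                                # !record
        game   => $game,             # *001 -- same hash, not a copy
        number => 59,
        player => $player,           # *002
        record => 'Most points, one half',
    },
];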


object
    hash

hash
    {}
    { pairs }

pairs
    key: value
    key: value, ...

key
    string
    &id string
    !type string
    !type &id string
    
list
    []
    [ elements ]

elements
    value
    value, elements

value
    key
    string
    number
    object
    hash
    list
    true
    false
    null

Toward a common script grammar

It seems like scripting languages are slowly converging on a small number of common grammars and one structural data representation.


Common Grammars

The two common grammars are (1) simplified C and (2) Smalltalk.

More to the point, the simplification of C includes removing those pesky parentheses in control flow expressions.  I see them dropping everywhere, which I think is probably a good thing.

Anyway, it seems reasonable that most scripting languages' identities are bound up in their feature sets and runtime details, rather than their basic grammar, and therefore convergence on a common grammar could only be beneficial.  Naturally it won't happen, but it would be nice.

Alternatively, it seems like an ideal scripting language would be able to understand many variants of C grammar, if only because the grammars are so closely related and are, essentially, representations of solved problems.  Solved problems should not have variant grammars.


Common Structure

The structural data rep has small variations, but is essentially curly brackets to hold key-value pairs, a pair operator, and square brackets to hold lists.  The data model is more powerful than JSON, and less powerful than YAML, but otherwise looks like them.  Call it JSON with references, or a strict, sane subset of YAML consisting of scalars, lists, hashes, and references.  YSON, perhaps.

Seems that an ideal scripting language would therefore allow the user to specify the pair operator, key decoration, and value decoration.  Should take care of the major players.
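
As a toy illustration -- a sketch of the idea, not a design -- here's the pair operator as a runtime parameter in Perl.  The function name and interface are invented:

use strict;
use warnings;

# parse a flat run of pairs, with the pair operator passed in
sub parse_pairs {
    my ($text, $pairOp) = @_;
    my %hash;
    for my $pair (split /,\s*/, $text) {
        my ($k, $v) = split /\s*\Q$pairOp\E\s*/, $pair, 2;
        $hash{$k} = $v;
    }
    return \%hash;
}

my $yson_style = parse_pairs('date: 1962, versus: New York', ':');
my $perl_style = parse_pairs('date => 1962, versus => New York', '=>');

Key decoration and value decoration could be handled the same way, as a pair of user-supplied patterns.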
 

Monday, October 3, 2011

Deneb sourcebook

...is crawling along, sometimes painfully slowly. But I'm still on schedule, thanks to giving myself lots and lots of time.  I'm currently working on "Fleets" - how duchies and smaller states in the sector squabble with each other by tradewar and skirmishings, and how they do it without provoking an Imperial response.

Wednesday, September 7, 2011

I've got no time to code Perl (or anything else)

I need a less demanding job.

One that pays what I'm getting paid, but lets me work less.

Yeah, right.

Funny, but twenty years ago, before I had a house and a family, I still didn't feel like coding once I got home from work. I suspect the reason for this is that the job sucks all the code out of me during the day, leaving my brain tired and ready for downtime.

I wish it weren't so.

The only solution is to work smarter, both at work and at home. To develop programming tools which will build a foundation upon which I can do what I want. Tools that can be written modularly, a little bit at a time. Bottom-up programming.

Perl, Javascript, and Traveller

For a couple years I've had a suite of Javascript pages for building stuff for Traveller, 5th edition. They use Traveller's new rules for stacking attributes onto a base item, resulting in myriads of armor, weapons, vehicles and smallcraft.

Very recently, I added a backend Perl component which will email the results to you.  So say you're on your iPad and you define a fast civilian grav truck that can reach orbit.  Just enter your email address, and it will email the results to you.

I wrote that bit because I wanted to use my scripts from my iPad, and didn't want to rely on cut-and-paste.

A link to the scripts is here: http://eaglestone.pocketempires.com/scripts/armormaker.html
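
The backend amounts to a few lines of CGI.  This isn't the actual script, just a minimal sketch of its shape -- the field names and the From address are made up:

#!/usr/bin/perl
use strict;
use warnings;
use CGI;
use MIME::Lite;

my $q    = CGI->new;
my $to   = $q->param('email');     # where to send the design
my $body = $q->param('results');   # the generated item text

my $msg = MIME::Lite->new(
    From    => 'scripts@example.com',
    To      => $to,
    Subject => 'Your Traveller design',
    Data    => $body,
);
$msg->send;

print $q->header('text/plain'), "Mailed to $to\n";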

Bioinformatics and Perl

In order to do bioinformatics programming, you have to know a LOT about genetics.  You have to know the terminology and cell, uh, let's call it 'mechanics' for now, until I know what the official Latin-based fourteen-syllable word really is.

You also have to know Perl.

I know Perl.  Maybe I can learn the language of medical science.  Maybe I can help.

Maybe it will help improve the standard of living for people with special needs.

If you had told me back in 1994 that learning Perl might aid research that could help my daughter, I would have laughed. Perl was a fun hobby. It helped me write scripts that got work done faster so that I could spend more time writing scripts for Traveller. It was not really practical in a larger sense.

Monday, August 15, 2011

PushButtonEngine mini-reference

I have plans, always plans, to churn out some apps for the iPad. Some of those are conversions of the old 8-bit games I wrote for the Commodore 64 for my own geeky pleasure. In those cases, I want to use what I already know -- ActionScript development -- with a bit of help; that is, PushButtonEngine.

So this is a very brief, very basic component reference for me.  And an order of operations:

(1) allocate the entity
(1a) make sure the "layerIndex" is correct... lower = further in the background
(2) create the spatial component
(3) create the sprite component
(4) create the controller component, pointing to the spatial comp.

ref: http://blog.flashgen.com/gaming/pushbutton-engine/pushbutton-engine-working-with-bitmap-sprites/




SimpleSpatialComponent - sets size and position of the Entity. 
SimpleShapeRenderer - assigns a basic shape (circle, square) to the Entity. 
Tends to "bind" to the Entity's size, position, rotation.
 
SpriteRenderer - what we actually use to represent an Entity. 
 
TickedComponent - gets called at each game tick. 
Used for controlling an Entity by listening to input, for example.
 
A flexible way to handle inputs is to map them to handlers.
 
public class InputMapKeyboardController extends TickedComponent
{...}
 
ref: http://blog.flashgen.com/gaming/pushbutton-engine/pushbutton-engine-handling-user-input/

Wednesday, July 20, 2011

ANT is insane

So I'm learning how to use ANT. Or, rather, I'm learning the insane limitations of ANT tasks.

Here's how it works. Every ANT task has its own API, its own behavior, and a very limited way of operating.

Example: foreach. I cannot pass in a list of module names, referenced from an XML Property file, into foreach for processing. Can't be done. Why? Because the coder didn't code that in.

But why did the coder have to specify the types of data a foreach statement could accept at all? Why should a generic iterator need to know anything beyond how to iterate? And yet ANT's foreach does; apparently it must, presumably because ANT itself has insufficient expressive power.

If that's a fundamental limitation of the ANT core system, then why didn't the original writer write ANT in a more generic fashion? This blows me away. It's insane.

If we're going to create a de facto script language out of XML, let's do it right, folks.

(1) Accept a LIST, and nothing more. If this means ANT needs a foundational LIST type, then so be it. If this means foreach needs to be rewritten to accept a LIST and only a LIST, then so be it. To do anything else is INSANE.

(2) Define tasks to be performed inside the foreach. After all, this is control flow. Don't make me perform an assembly-language-like JUMP. Do It Right.

<foreach list="${my.modules}" param="item">
    <compile sourcePath="${item.sourcePath}"
             destPath="${item.destPath}"/>
</foreach>

Friday, July 15, 2011

i64 Diskette Image header

A "D64 file" is an image file of a real Commodore 1541 diskette. D64s, and related formats, have been around since the 90s. In a previous post, I explained that it might be nice to see a more flexible, parametric approach to Commodore disk images.

I'm currently calling this format "i64", although I haven't fully settled on it.

An old draft of the format is here.

In short, it's simply an optional header block, with parametric data explaining the structure of the image and important offsets.

The header is present in any disk image of non-standard size.  For example, if a .d81 file is not the correct size for a d81 image, look at the first block for the i64 custom header. If it's present, it's located in the first 256 bytes of the disk image -- where a header ought to be located. Otherwise, drive on as usual.

This fancy new header tells us very explicit things about the image.

The structure field is 8 bytes, and lays out the four zones of every Commodore disk: the number of tracks in each zone, and the number of sectors per track within that zone. It also records whether the disk is double-sided (thereby duplicating the zone data for side 2).

Error data can either be prepended to the actual tracks, or appended.

The locations of the disk header block, directory, and BAM are required, as are the offsets into the header and BAM data, how the BAM is stored in the case of dual-sided disks, and indeed whether the BAM exists at all.

An autoboot BAM offset, if any. A boot track, if any.

Interleaves.

The disk's default format type and DOS type.

Whether the REL field is used for the LSU.
Whether the file's timestamp is stored in the directory entry.
Whether the Directory is allowed to freely grow and range, like any other file.
Whether the image is allowed to grow dynamically, or is pre-allocated on creation.
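
To make that concrete, here's a sketch of a reader consuming the 8-byte zone structure in Perl.  The magic string and the offset of the structure within the block are placeholders I've invented, not the actual draft format:

use strict;
use warnings;

open my $in, '<:raw', 'custom.d81' or die $!;
read $in, my $block, 256;      # the header lives in the first block
close $in;

die "no i64 header\n" unless substr($block, 0, 4) eq 'i64!';   # invented magic

# four zones, two bytes each: tracks in the zone, sectors per track
my @zone = unpack 'C8', substr($block, 16, 8);   # offset 16 is assumed
while (my ($tracks, $sectors) = splice @zone, 0, 2) {
    print "zone: $tracks tracks x $sectors sectors\n";
}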

The future of the D64 format

The Commodore 64 is an ancient 8-bit system, enjoying a niche fanbase of aging gen-Xers and the youngest Boomers. Emulators are sophisticated and sufficient. You can run games off of a Commodore disk drive attached to your Mac or PC or Linux box. You can run games off of a Commodore computer attached to your home computer acting like a disk drive. You can buy SD-card-reading hardware that replaces a Commodore disk drive. You can buy a joystick that has an embedded C64 burned into its chips. Commodore diskette and hard drive images are insignificantly tiny, compared to today's storage capacities.

What is there to improve on, and why bother?

The second question is the easiest to answer: because the problem space is interesting and well-scoped.

The first question requires creativity. However, I do have some ideas.

Networked "UDP1541"

Current systems bundle all software into one application. Your C64 emulator includes disk and tape support at some level. Some are very sophisticated.

IEC emulation requires effort. It's painful. So I say, if someone has gone to the effort, make it shareable: split the disk drive into a separate application, and communicate via a primitive IEC-friendly protocol over UDP. Not only would this be a fun project, it would also allow emulators to use these "devices" even if they were programmed on a completely different platform. The FC64 AIR app could use the VICE UDP1541.

If you really must, then write it as TCP1541, i.e. using TCP/IP instead of UDP. In this way the C64 emulators may access Commodore drives located on any server anywhere on the internet.
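
A sketch of the drive side of the idea, in Perl.  The port number and the one-byte-opcode framing are invented for illustration -- the real protocol would have to carry the IEC handshake faithfully:

use strict;
use warnings;
use IO::Socket::INET;

my $drive = IO::Socket::INET->new(
    LocalPort => 15410,
    Proto     => 'udp',
) or die "bind: $!";

while ($drive->recv(my $msg, 512)) {
    next unless length $msg;
    my ($opcode, $payload) = unpack 'C a*', $msg;
    # e.g. LISTEN, TALK, data bytes... placeholder dispatch only
    print "opcode $opcode, ", length($payload), " bytes\n";
    $drive->send("\x00");   # ack back to the emulator
}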

3rd Generation Image Support

The venerable D64 works for 99% of all emulator needs. The G64 works for the remaining 1%. So why is another format needed?

Think of it the other way 'round. Emulated drives are common, and the D64 format itself is almost incidental; it's the emulation support that matters. An opportunity arises to support the 1541's quirks while not being constrained by them.

So rather than a new format, I suggest that disk drive emulators should be highly parameterized, and tweaked to read those parameters from disk images when present.

I see three areas of compatibility: (1) small programs, (2) large programs, and (3) experimentation.

First, MY Solution

My solution is an image header block: the first block, identified by a label and carrying image configuration parameters, would tell any reader not only exactly where all the data is located, but also the structure of the data itself -- for example, the number of tracks in each of the four zones, how the BAM is stored in special cases, whether error bytes are prepended or appended, whether the image is allocated fully or permitted to grow, and so on. This information and more sits easily within one block of data, and essentially parameterizes a disk image reader.

I think the best-case scenario would be for disk drive emulators to look for this block on any mounted image that's not the correct size for a typical image file.


1. Small Programs

Where you have small programs, and yet want to remain compatible with a disk-image milieu, it would be nice to have smaller disk images. This argues for a flexible file format at least. One way to do this is to reorganize the disk tracks, so that the header+directory track comes first, and the remaining tracks are added only as needed, in write-order, based on an explicit mapping.

Of course, if you have the flexible image format, all you have to do is define the format, and you're done.

2. Large Programs

The other potential gain is in large, multi-disk programs. In this case, it would be nice to be able to store more than 174k in one image. You could define a larger disk image in some cases.

However, I think a better solution in many (but not all) cases is for emulators to understand the TAR format, and use archiving to group related disks together.
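
Perl's core Archive::Tar shows how little machinery that would take on the image side -- the filenames here are just examples:

use strict;
use warnings;
use Archive::Tar;

my $tar = Archive::Tar->new;
$tar->add_files('game-side1.d64', 'game-side2.d64');
$tar->write('game.tar');

An emulator that understood the archive could then present side 2 whenever the program asks for a disk swap.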

3. Experimentation

This to me is the funnest part. If a disk reader is highly parameterized, there are lots of custom images you can make, and you'd be free to explore the space without worrying about support.

I have custom image formats that I've played with, and found that a little bit of parametric data goes a long way. While not immediately practical, there is potential for interesting formats. Perhaps this is the best way to archive programs, too: let the file size dictate how many zones, how many tracks, how many sectors to have.

Tuesday, June 14, 2011

How to parse D64 files, part 2: file chains

Last time, we saw how to crack open a D64 file with Perl and get at the first directory block. You may recall that two of the bytes in a directory entry point to the first block of the file proper, by way of a track number and a sector number.

Now let's see how to read files in general.

A file in a Commodore disk image is stored in blocks. Those blocks may be scattered across the disk; in fact, due to mechanical considerations, files were almost never written in a contiguous set of blocks. Instead, a block was written, then the next block was written a few sectors away, then the next a few more sectors away, and so on.

When a block is read, the next block's location is found in the first two bytes. The remaining 254 bytes are file data proper.

So then, suppose a directory entry indicates that a file begins at track $t, sector $s. Using Perl, the file could be reconstructed in a manner similar to this:

my $fileData = readFile( '', $startTrack, $startSector );

sub readFile
{
    my ($buffer, $t, $s) = @_;
    return $buffer unless $t;   # a track of 0 marks the end of the chain
    my $byteOffset = 256 * ($sectorOffset[ $t ] + $s);
    ($t, $s) = unpack 'CC', substr( $diskImage, $byteOffset, 2 );
    $buffer .= substr( $diskImage, $byteOffset + 2, 254 );
    # (strictly, the final block's "sector" byte counts its valid bytes;
    # ignored here for simplicity)
    return readFile( $buffer, $t, $s );
}


@sectorOffset will require some more explanation.
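
In the meantime, a sketch: @sectorOffset holds, for each track, the number of sectors that precede it on the disk. The 1541's zone geometry is standard -- 21 sectors per track on tracks 1-17, then 19, 18, and 17 in the later zones -- so the table could be built like this:

my @sectorsPerTrack = ( (21) x 17, (19) x 7, (18) x 6, (17) x 5 );
my @sectorOffset = (0, 0);            # tracks are numbered from 1
for my $t (1 .. 34) {
    $sectorOffset[$t + 1] = $sectorOffset[$t] + $sectorsPerTrack[$t - 1];
}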

Monday, June 13, 2011

How to parse D64 files, part 1

This is a short how-to for parsing Commodore 1541 diskette images, colloquially called D64 files after their three-character suffix.

These instructions assume you know how to program in at least one C-derived programming language. I will be using Perl, but it will be very C-like.

STEP ONE: install Perl. A good and free distribution can be had over at http://activestate.org.

STEP TWO: fetch some D64 files. One likely place is http://lemon64.com.

Now it's time to write some code. D64 files are simply data laid out the way a 1541 would see it as it reads straight from the disk, from beginning to end. So the code will have to navigate the structure of the 1541 diskette format. First, though, let's slurp the entire image into a buffer:


my $filename = 'whatever.d64';
my $filesize = -s $filename;

open( IN, '<', $filename ) or die "$filename: $!";
binmode IN;
my $buffer;
read( IN, $buffer, $filesize );
close IN;


Okay, we've got the disk in a buffer. Now what? Now we wrest the structure from this pile of bytes. That structure begins with the DIRECTORY. The directory starts (18-1) * 21 + 1 blocks in -- that's track 18, sector 1, since each of the 17 preceding tracks holds 21 sectors -- and a block is 256 bytes. So let's put that in a variable for later.


my $directorySector = (18-1) * 21 + 1;   # track 18, sector 1
my $offset = 256 * $directorySector;     # byte offset into the image


Now we need to parse the actual data from the directory. The directory block consists of eight entries of 32 bytes each. Each byte has a meaning: some point to locations on the disk, some are part of a filename, some are status bytes for the file, and so on. I will iterate over the directory block, 32 bytes at a time, and at each iteration unpack some of the current 32-byte structure into its component values:


for ( my $j = 0; $j < 256; $j += 32 )
{
    my ($dirtrack,   # unsigned char: track of the next directory block
        $dirsector,  # unsigned char: sector of the next directory block
        $type,       # unsigned char: file type byte
        $track,      # unsigned char: first track of the file
        $sector,     # unsigned char: first sector of the file
        $filename)   # 16-character PETSCII string, padded with $A0
        = unpack 'CCCCCa16', substr( $buffer, $offset + $j, 32 );

    print "$filename [$type] is located at $track/$sector\n";
}


One very important pair of values in the above structure is $track and $sector. They tell us where to find the first block of that file.

This is a good stopping point. We've taken a D64 file, read it in, and printed out the contents of the first directory sector -- i.e., the first 256-byte block of the directory. Next time, we'll see how to read in an entire file.

Friday, May 27, 2011

Main Points from Adobe

So I attended a day-long presentation on Acrobat and Creative Suite 5.5 recently. Here are the primordial take-aways I got out of it:

* mobile is "big" for Adobe (now). Really!
* the typical information worker spends 17 hours a week *creating* content.
* rich media will be 25% of content by 2013.

* HTML5 is not a document format.

^ That's something which occurred to me during the Acrobat-portion of the presentation.

* process improvements and reducing costs are the top priorities in companies.
* process improvement leads to productivity gain.
* reduced costs are achieved via standards, best practices.

* that's what drove Adobe's work on Acrobat X. Editing content, table data, headers, OCR, SharePoint integration, highlighter and sticky-notes, PPT to PDF, embedded QT and MPG, commenting tools, forms, digital signatures, document comparison, a macro system, legal and governmental electronic document standards support, et al.

That's it in a nutshell.

Oh yes: Flex/ActionScript content is maybe 10% of Adobe's business.

Friday, May 6, 2011

Modern Perl

I was talking with an acquaintance the other day about Modern Perl -- i.e., the current way of coding Perl.

Perl's flexibility shows in its longevity: code written for 1994's Perl will still run on 2011's Perl, and yet you can do things today that you couldn't do back then.

This is due to at least two things:

(1) there's an active core development of Perl, which extends the language in useful but backwards-compatible ways (this is possible partly due to the power, flexibility, and dynamism of Perl itself),

and

(2) there's active module development by the community to meet the programming needs of the community. CPAN continues to astound me. There is nothing with the scope and depth of CPAN anywhere for any other programming language.


Languages like Scheme and Python insist on a small, orthogonal core set of operations. And with some languages (like Smalltalk), you can go a long way on that. But let's face it, you have to get work done sooner or later. Even Common LISP understood that. Perl's amazing library of modules lets you do anything you like... even emulate and embed other languages.


Say you've got a relatively clean, small (and therefore embeddable), C-like language which includes hashtables and some improved control structures, used for scripting small actions. As a loose superset of these sorts of languages, Perl is ideal for aggregating and linking libraries, finding and factoring out common code fragments, and even generating the code and acting as a test harness. In fact, Perl modules could exist today for performing most of these functions -- a Git module could double as a library front end; a text tool could find redundant code; and a template engine could generate code from a JSON structure.

You can't do that with Lua itself. But you could do it all with Perl. And since Perl's syntax descends from C, it's easy to pick up Perl as another tool in your belt.
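
As a small taste of that last point, here's a sketch of the template-engine idea using JSON and Template from CPAN. The spec format and the template are invented; the output happens to be Java:

use strict;
use warnings;
use JSON;
use Template;

# a made-up class spec, as the kind of JSON structure described above
my $spec = decode_json(<<'END');
{ "class": "Point", "fields": ["x", "y"] }
END

my $tt = Template->new;
$tt->process(\<<'END', $spec) or die $tt->error;
public class [% class %] {
[% FOREACH f IN fields %]    private int [% f %];
[% END %]}
END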

Thursday, May 5, 2011

HandyWrite - the Ultimate Shorthand System

It's time for an app that understands shorthand. It's time to bring shorthand back.

Check this out:

http://www.alysion.org/handy/handywrite.htm

This guy designed what looks like a near-optimal shorthand system:

1. He started with Gregg shorthand. Note that using the Gregg letters alone will speed up your note-taking significantly. The downside is that the notes themselves are often abbreviations, so you have to transcribe soon after you've taken the notes or else they become a bit opaque.

2. He extended it to represent the full range of English pronunciation. Now you have a 1:1 sound correspondence with English. You can read back what you wrote years later; i.e., it's a full writing system.

3. He then added a symbology for just 100 of the most common English words. Not nearly as extensive as classical shorthand, but all you need to record notes as fast as they're spoken.


Ok, the best thing about it in my opinion is that it honors and re-uses prior art in simply extending Gregg shorthand. That alone is worth something in my book -- and not just for sentimental reasons: Gregg is a thoughtful and elegant system.

I've jotted down the letter forms, and will be practicing them little by little as I have time. My goal is to be able to record meetings with it.

Actually, my real goal is to write an app that will let me take notes with it.

Perl Programming

I'm a Perl fanatic. Ever since I learned of it waaay back in 1995, I've been able to do amazing things with it, primarily in my job function as a programmer.

As a programmer, I face two daily tasks. The first is writing code to accomplish something for the company, i.e. business logic -- web pages, for example. The other is wrangling the piles and piles of data that a company has to manage.

1. Programming requires writing code. Believe it or not, at every job I have used Perl at one time or another to write code for me. Perl is wonderful for code generation. The code I've generated the most of using Perl is: Java.

2. Data is strewn in multiple formats in multiple places. And Perl is supreme at data mining. Whether I'm scraping web pages or tossing binary digits, Perl is awesome.
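
The kind of two-minute scrape I mean, sketched with LWP::Simple -- the URL and the pattern are placeholders:

use strict;
use warnings;
use LWP::Simple;

my $page = get('http://example.com/prices.html') or die "fetch failed\n";
my @prices = $page =~ /\$(\d+\.\d{2})/g;    # pull out the dollar amounts
print "found ", scalar @prices, " prices\n";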

So if you have to handle data, I suggest Perl is the best fit for the job. It's not that you can't do it with other languages (I've used, and still use, Java, Python, C, AWK, Smalltalk, JavaScript, ActionScript, Ruby, Flex, and LISP); it's just that Perl is handier.