
Building a Text File Database Part 3: Writing Data

After separating out domain concerns from file access, I felt ready to approach writing data to the file.

Adding a new Record

To add data, my first instinct was to require constructing a new Record instance and passing it into FileDb. But this would mean retooling the internals of the Record object to allow new state to be added to it. It currently knows how to lazy-load from the file, so this would be a large departure from how it's used. I thought about pulling Record up to an interface and having implementors ExistingRecord and NewRecord, but that already felt too complex. I ultimately decided to go with a simple Map for the time being:

    public Record put(Map<String, String> record) {
        ByteBuffer buffer = ByteBuffer.allocate(recordLength);
        String rawRow = columns
                // For each of the defined columns, in order
                .stream()
                .map(column -> {
                    // get the value for that column, default to blank space since we'll need something in the column
                    String value = record.getOrDefault(column.name, "");
                    if (value.length() > column.size) {
                        throw new IllegalArgumentException("Field value (%s) is too large for column size (%s): %s".formatted(value.length(), column.size, value));
                    }
                    // pad the value with empty spaces
                    return "%s%s".formatted(value, " ".repeat(column.size - value.length()));
                })
                // separate each field with a |
                .collect(Collectors.joining("|"))
                + "|\n";

        buffer.put(rawRow.getBytes());
        buffer.flip();
        long recordCutPoint;
        try {
            recordCutPoint = fileChannel.size();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        // write record at the end of the file
        Future<Integer> op = fileChannel.write(buffer, recordCutPoint);
        try {
            op.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        }

        // re-read the record as a sanity check and send back
        return readRecord(buffer, recordCutPoint);
    }

The API isn't 100% clear about what the Map is, but I don't think using a Record would have helped with that. Maybe a little, à la new Record().addField("title", "The Man in the High Castle"), but this is fine for now.
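
For reference, calling put then looks something like this. A quick sketch: the db instance and the title/author columns are assumptions based on the examples in this series, not fixed parts of the API.

    // Hypothetical FileDb whose title and author columns are defined in the file's first line
    Record added = db.put(Map.of(
            "title", "The Man in the High Castle",
            "author", "Philip K. Dick"
    ));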

Updating a field

Overwriting a field ended up being a breeze with this setup:

    public Record update(Record record, String field, String newValue) {
        ByteBuffer buffer = ByteBuffer.allocate(recordLength);
        Item item = record.get(field);

        buffer.put("%s%s".formatted(newValue, " ".repeat(item.column.size - newValue.length())).getBytes());
        buffer.flip();
        Future<Integer> op = fileChannel.write(buffer, item.startPosition);
        try {
            op.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        }

        return readRecord(buffer, record.startPosition);
    }

I don't even need to check for field values that are too large, since repeat() would be called with a negative count and throw an exception. But I'll add the check anyway, just so I don't confuse myself in the future.
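
Usage is symmetric with put. Another sketch, reusing the added record from the earlier example: an oversized value fails with an IllegalArgumentException either way, whether from an explicit size check or from repeat() being handed a negative count.

    Record updated = db.update(added, "author", "P. K. Dick");

    try {
        db.update(added, "author", "An author name far longer than the column allows");
    } catch (IllegalArgumentException e) {
        System.out.println(e.getMessage());
    }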

A note on the field separator: |

From the get-go I had a concern about using the pipe (or any) symbol to separate field values. Although it was unlikely for an author or book title to contain a |, I was starting to consider this format for other small projects. But that issue has taken care of itself. Records are now read with fields represented by Items, each holding a reference to its Column, which knows its size. This was done to ease updating of fields, but it had the side effect that reads no longer split on |; instead, fields are read by walking the list of Columns and reading from recordStartPosition + itemStartPosition. At this point the field separator could be anything, and it only aids human readability of the file. The only place it still matters is the first line, which holds the field definitions.
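
To make the offset arithmetic concrete, here's a rough sketch (an illustration, not the actual reader code from this series): each field's start within a record falls out of the preceding column sizes, plus one byte per separator.

    // Sketch: derive each field's offset within a record from the column list.
    int offset = 0;
    Map<String, Integer> fieldOffsets = new LinkedHashMap<>();
    for (Column column : columns) {
        fieldOffsets.put(column.name, offset);
        offset += column.size + 1; // +1 skips the | (or whatever) separator
    }
    // A field's absolute position is recordStartPosition + fieldOffsets.get(name),
    // and its value is the next column.size bytes, trimmed of trailing padding.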