What about creating a hierarchical CSV standard?

Posted by Lucas
Jul 21, 2024 at 06:13 PM

Some outliner programs have database-like capabilities, allowing for extensive metadata/columns (programs such as InfoQube, Tinderbox, and Tana). And, recently, more database-like programs have been adding the ability to create hierarchical relationships between items (programs such as Notion, Zenkit, etc.). I’ve been thinking for a long time that the ideal for interoperability would be to develop a CSV-based standard for indicating hierarchical relationships.

(OPML is great, but, at least from my layperson’s perspective, it doesn’t seem especially suited to handling databases or to indicating when items have multiple parents. And it’s not so easy for a layperson to edit complex OPML files directly.)

Meanwhile, non-outliner programs like Notion and Zenkit each take their own approach to indicating hierarchical relationship metadata upon CSV export. Some show item parents as a column, some show item children as a column, some show both. Some show these relationships using item names, some show them using item IDs, and so forth.

So, in this regard, it seems to me that, for the sake of interoperability, there would be room for someone to propose a CSV-based standard for hierarchical text-based databases (or text-based outlines with extensive metadata/columns). This would then incentivize developers to make it possible to import from and export to this standardized format. Maybe this is just a pipe dream—I don’t work in the software field, so I may be naive—but I’m curious what others think.

(Of course, programs that do not allow for multiple parents could still use this standard. They would just decide how to handle such information upon import. But my idea is that the standard would establish an agreed-upon method for indicating multiple parentage when needed.)

Posted by Andy Brice
Jul 21, 2024 at 08:06 PM

Speaking as a software developer who has written CSV parsers, CSV is a fairly horrible format, for all sorts of reasons, including:

-loosely defined standard (e.g. doesn’t specify text encoding)
-difficult to parse in multiple threads, due to ‘escaping’
-a single wrongly placed quote can corrupt everything afterwards
-no provision for comments or metadata
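
To illustrate the misplaced-quote point above, here is a minimal sketch using Python's standard `csv` module in its default (non-strict) mode. The sample data and the names `good` and `bad` are made up for illustration; other parsers may react differently to the same input.

```python
import csv
import io

# Two versions of the same made-up data. In `bad`, the closing quote
# after Alpha is missing.
good = 'id,name\n1,"Alpha"\n2,Beta\n3,Gamma\n'
bad = 'id,name\n1,"Alpha\n2,Beta\n3,Gamma\n'

good_rows = list(csv.reader(io.StringIO(good)))
bad_rows = list(csv.reader(io.StringIO(bad)))

# The well-formed text yields a header plus three data rows.
print(len(good_rows))

# With the quote missing, everything after it is read as part of one
# quoted field, so the later rows are swallowed into a single cell.
print(len(bad_rows))
print(bad_rows[-1])
```

Note that no error is raised here: the parser simply returns fewer rows, which is why this kind of damage can go unnoticed.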

It has also been poorly implemented in many apps.

Its one saving grace is that you can easily edit it with a standard text editor.

It is pretty easy to define hierarchical relations in any tabular format, including CSV. Just have a unique ID for each row and use it to reference child and/or parent rows, whichever is more convenient. I don’t see any real need for a standard.
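
As a rough sketch of the ID-based approach, assuming a hypothetical layout with an `id` column and a `parent_ids` column in which multiple parents are separated by semicolons (the column names, the separator, and the sample rows are illustrative assumptions, not an existing convention):

```python
import csv
import io
from collections import defaultdict

# Made-up sample data: item 3 has two parents (1 and 2).
CSV_TEXT = """id,title,parent_ids
1,Projects,
2,Research,
3,Outliner comparison,1;2
4,CSV notes,3
"""


def load_hierarchy(text):
    """Return (items, children): id -> title, and parent id -> child ids."""
    items = {}
    children = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        items[row["id"]] = row["title"]
        # An empty parent_ids cell marks a top-level item.
        for parent_id in filter(None, row["parent_ids"].split(";")):
            children[parent_id].append(row["id"])
    return items, dict(children)


def outline(items, children, node, depth=0):
    """Return indented title lines for `node` and its descendants.

    Assumes the data contains no cycles.
    """
    lines = ["  " * depth + items[node]]
    for child in children.get(node, []):
        lines.extend(outline(items, children, child, depth + 1))
    return lines


items, children = load_hierarchy(CSV_TEXT)
for root in ("1", "2"):
    print("\n".join(outline(items, children, root)))
```

Because item 3 lists both 1 and 2 as parents, it shows up under each of them when the outline is printed. A program that allows only one parent per item could keep the first ID and ignore the rest on import.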