The core text management data structure, which supports efficient modifications and provides a byte string interface. Text positions are represented as size_t. Valid positions are in the range [0, text_size(txt)]. An invalid position is denoted by EPOS. Access to the non-contiguous pieces is available by means of an iterator interface or a copy mechanism. Text revisions are tracked in a history graph.


The text is assumed to be encoded in UTF-8.


Text *text_load(const char *filename)

Create a text instance populated with the given file content.


Returns the new Text object, or NULL in case of an error.


When attempting to load a non-regular file, errno will be set to:

  • EISDIR for a directory.

  • ENOTSUP otherwise.

  • filename: The name of the file to load, if NULL an empty text is created.

void text_free(Text*)

Release all resources associated with this text instance.


size_t text_size(Text*)

Return the size in bytes of the whole text.

struct stat text_stat(Text*)

Get file information at time of load or last save, whichever happened more recently.


If an empty text instance was created using text_load(NULL) and it has not yet been saved, an all zero struct stat will be returned.


See stat(2) for details.

bool text_modified(Text*)

Query whether the text contains any unsaved modifications.


bool text_insert(Text*, size_t pos, const char *data, size_t len)

Insert data at the given byte position.


Returns whether the insertion succeeded.

  • pos: The absolute byte position.

  • data: The data to insert.

  • len: The length of the data in bytes.

bool text_delete(Text*, size_t pos, size_t len)

Delete data at the given byte position.


Returns whether the deletion succeeded.

  • pos: The absolute byte position.

  • len: The number of bytes to delete, starting from pos.

bool text_delete_range(Text*, Filerange*)
bool text_printf(Text*, size_t pos, const char *format, ...)
bool text_appendf(Text*, const char *format, ...)


The individual pieces of the text are not necessarily stored in a contiguous memory block. These functions copy the requested data into such a contiguous region.

bool text_byte_get(Text*, size_t pos, char *byte)

Get byte stored at pos.


Returns whether pos was valid and byte was updated accordingly.


Unlike text_iterator_byte_get() this function does not return an artificial NUL byte at EOF.

  • pos: The absolute position.

  • byte: Destination address to store the byte.

size_t text_bytes_get(Text*, size_t pos, size_t len, char *buf)

Store at most len bytes starting from pos into buf.


Returns the number of bytes (<= len) stored at buf.


buf will not be NUL terminated.

  • pos: The absolute starting position.

  • len: The length in bytes.

  • buf: The destination buffer.

char *text_bytes_alloc0(Text*, size_t pos, size_t len)

Fetch a text range into a newly allocated memory region.


Returns a contiguous NUL-terminated buffer holding the requested range, or NULL in case of an error.


The returned pointer must be free(3)-ed by the caller.

  • pos: The absolute starting position.

  • len: The length in bytes.
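
To illustrate why a copy is needed, the sketch below models a text as an array of pieces and gathers a byte range into one contiguous, NUL-terminated buffer. The Piece struct and pieces_bytes_alloc0() are hypothetical stand-ins chosen for this example; they are not the library's internal types or functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one chunk of text, not the library's type. */
typedef struct {
	const char *data;
	size_t len;
} Piece;

/* Copy at most len bytes, starting at absolute offset pos, out of the
 * given pieces into a newly allocated, NUL-terminated buffer.
 * Returns NULL on allocation failure; the caller must free(3) the result. */
static char *pieces_bytes_alloc0(const Piece *pieces, size_t count, size_t pos, size_t len) {
	assert(pieces != NULL || count == 0);
	if (len == SIZE_MAX)
		return NULL; /* len + 1 would overflow */
	char *buf = malloc(len + 1);
	if (!buf)
		return NULL;
	size_t stored = 0;
	for (size_t i = 0; i < count && stored < len; i++) {
		if (pos >= pieces[i].len) {
			/* the requested range starts after this piece */
			pos -= pieces[i].len;
			continue;
		}
		size_t n = pieces[i].len - pos;
		if (n > len - stored)
			n = len - stored;
		memcpy(buf + stored, pieces[i].data + pos, n);
		stored += n;
		pos = 0; /* subsequent pieces are read from their start */
	}
	buf[stored] = '\0';
	return buf;
}
```

If the range extends past the end of the pieces, the sketch simply returns fewer bytes, mirroring the "at most len bytes" wording used for text_bytes_get().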


An iterator points to a given text position and provides interfaces to adjust said position or read the underlying byte value. Functions which take a char pointer will generally assign the byte value after the iterator was updated.

struct Iterator

Iterator used to navigate the buffer content.

Captures the position within a Piece.


Any change to the Text will invalidate the iterator state.


Should be treated as an opaque type.

Iterator text_iterator_get(Text*, size_t pos)
bool text_iterator_valid(const Iterator*)
bool text_iterator_next(Iterator*)
bool text_iterator_prev(Iterator*)



For a read attempt at EOF (i.e. text_size()) an artificial NUL byte which is not actually part of the file is returned.

bool text_iterator_byte_get(Iterator*, char *b)
bool text_iterator_byte_prev(Iterator*, char *b)
bool text_iterator_byte_next(Iterator*, char *b)
bool text_iterator_byte_find_prev(Iterator*, char b)
bool text_iterator_byte_find_next(Iterator*, char b)
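
The EOF behaviour can be pictured with a toy iterator over a flat buffer. This is only a sketch under simplifying assumptions: ByteIter, iter_byte_get() and iter_byte_next() are hypothetical names, and the real Iterator tracks a position within a Piece rather than a plain offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flat-buffer iterator, used for illustration only. */
typedef struct {
	const char *buf;
	size_t len;
	size_t pos; /* valid range is [0, len]; pos == len means EOF */
} ByteIter;

/* Read the byte at the current position. At EOF an artificial NUL
 * byte, which is not part of the buffer, is reported. */
static bool iter_byte_get(const ByteIter *it, char *b) {
	assert(it != NULL && b != NULL);
	if (it->pos < it->len) {
		*b = it->buf[it->pos];
		return true;
	}
	if (it->pos == it->len) {
		*b = '\0';
		return true;
	}
	return false;
}

/* Advance by one byte, then assign the byte at the new position.
 * Fails without moving if the iterator is already at EOF. */
static bool iter_byte_next(ByteIter *it, char *b) {
	assert(it != NULL && b != NULL);
	if (it->pos >= it->len)
		return false;
	it->pos++;
	return iter_byte_get(it, b);
}
```

Note that the byte is assigned after the position was updated, which matches the convention stated for functions taking a char pointer.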


These functions advance to the next/previous leading byte of a UTF-8 encoded Unicode codepoint by skipping over all continuation bytes of the form 10xxxxxx.

bool text_iterator_codepoint_next(Iterator *it, char *c)
bool text_iterator_codepoint_prev(Iterator *it, char *c)
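
The skipping of continuation bytes can be sketched on a plain byte buffer as follows. The helpers utf8_next() and utf8_prev() are hypothetical and operate on offsets instead of an Iterator; they assume well-formed UTF-8 and perform no validation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A continuation byte has the bit pattern 10xxxxxx. */
static bool utf8_is_continuation(unsigned char b) {
	return (b & 0xC0) == 0x80;
}

/* Advance from pos to the next leading byte, or to len if none is left. */
static size_t utf8_next(const char *buf, size_t len, size_t pos) {
	assert(buf != NULL && pos <= len);
	if (pos < len)
		pos++;
	while (pos < len && utf8_is_continuation((unsigned char)buf[pos]))
		pos++;
	return pos;
}

/* Move back from pos to the previous leading byte, or to 0 if none is left. */
static size_t utf8_prev(const char *buf, size_t len, size_t pos) {
	assert(buf != NULL && pos <= len);
	if (pos > 0)
		pos--;
	while (pos > 0 && utf8_is_continuation((unsigned char)buf[pos]))
		pos--;
	return pos;
}
```

For the 7-byte string consisting of "a", U+00E4 (2 bytes), U+20AC (3 bytes) and "z", repeated calls to utf8_next() starting at offset 0 visit the offsets 1, 3, 6 and 7.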

Grapheme Clusters

These functions advance to the next/previous grapheme cluster.


The grapheme cluster boundaries are currently not implemented according to UAX#29 rules. Instead, a base character followed by arbitrarily many combining characters, as reported by wcwidth(3), is skipped.

bool text_iterator_char_next(Iterator*, char *c)
bool text_iterator_char_prev(Iterator*, char *c)


Translate between 1 based line numbers and 0 based byte offsets.

size_t text_pos_by_lineno(Text*, size_t lineno)
size_t text_lineno_by_pos(Text*, size_t pos)
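
The translation can be illustrated on a flat buffer by counting newline bytes. The sketch below is not the library's implementation: pos_by_lineno(), lineno_by_pos() and the NO_POS sentinel (modelled on EPOS) are hypothetical, and the handling of out-of-range arguments is an assumption, since the text above does not specify it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sentinel for "no such position", modelled on EPOS. */
#define NO_POS ((size_t)-1)

/* Byte offset of the first byte of the 1 based line lineno,
 * or NO_POS if the buffer has fewer lines (assumed behaviour). */
static size_t pos_by_lineno(const char *buf, size_t len, size_t lineno) {
	assert(buf != NULL || len == 0);
	if (lineno == 0)
		return NO_POS;
	size_t pos = 0;
	for (size_t line = 1; line < lineno; line++) {
		/* skip to the newline terminating the current line */
		while (pos < len && buf[pos] != '\n')
			pos++;
		if (pos == len)
			return NO_POS;
		pos++; /* step past the newline */
	}
	return pos;
}

/* 1 based line number containing the 0 based byte offset pos,
 * or 0 if pos lies beyond the buffer (assumed behaviour). */
static size_t lineno_by_pos(const char *buf, size_t len, size_t pos) {
	assert(buf != NULL || len == 0);
	if (pos > len)
		return 0;
	size_t lineno = 1;
	for (size_t i = 0; i < pos; i++) {
		if (buf[i] == '\n')
			lineno++;
	}
	return lineno;
}
```

A newline byte is counted as part of the line it terminates, so for "ab\ncd\n" offset 2 is on line 1 and offset 3 is on line 2.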


Interfaces to the history graph.

void text_snapshot(Text*)

Create a text snapshot, that is a vertex in the history graph.

size_t text_undo(Text*)

Revert to previous snapshot along the main branch.


Takes an implicit snapshot.


Returns the position of the first change, or EPOS if already at the oldest state, i.e. there was nothing to undo.

size_t text_redo(Text*)

Reapply an older change along the main branch.


Takes an implicit snapshot.


Returns the position of the first change, or EPOS if already at the newest state, i.e. there was nothing to redo.

size_t text_earlier(Text*)
size_t text_later(Text*)
size_t text_restore(Text*, time_t)

Restore the text to the state closest to the time given.
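
One way to picture "closest to the time given" is a search over the creation times of all recorded states. The sketch below assumes the states are available as a plain array of time_t values, which is a simplification of the history graph; closest_state() is a hypothetical helper and its tie-breaking rule (the earlier index wins) is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Return the index of the entry in times[] closest to when,
 * or (size_t)-1 if the array is empty. On a tie the lower
 * index is kept. */
static size_t closest_state(const time_t *times, size_t count, time_t when) {
	assert(times != NULL || count == 0);
	size_t best = (size_t)-1;
	double best_diff = 0;
	for (size_t i = 0; i < count; i++) {
		double diff = difftime(times[i], when);
		if (diff < 0)
			diff = -diff;
		if (best == (size_t)-1 || diff < best_diff) {
			best = i;
			best_diff = diff;
		}
	}
	return best;
}
```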

time_t text_state(Text*)

Get creation time of current state.


TODO: This is currently not the same as the time of the last snapshot.


A mark keeps track of a text position. Subsequent text changes will update all marks placed after the modification point. Reverting to an older text state will hide all affected marks, redoing the changes will restore them.


Due to an optimization, cached modifications (i.e. no text_snapshot() was performed between setting the mark and issuing the changes) might not adjust mark positions accurately.

typedef uintptr_t Mark

A mark.


EMARK denotes an invalid mark, lookup of which will yield EPOS.

Mark text_mark_set(Text*, size_t pos)

Set a mark.


A mark set at text_size() will always yield the current text size upon lookup.


Returns the mark, or EMARK if an invalid position was given.

  • pos: The position at which to store the mark.

size_t text_mark_get(Text*, Mark)

Lookup a mark.


Returns the byte position, or EPOS for an invalid mark.

  • mark: The mark to look up.


enum TextSaveMethod

Method used to save the text.



Automatically choose the best option.


Save file atomically using rename(2).

Creates a new file named filename~ and tries to restore all important metadata, after which it is atomically moved to its final (possibly already existing) destination using rename(2).


This approach does not work if:

  • The file is a symbolic link.

  • The file is a hard link.

  • File ownership cannot be preserved.

  • File group cannot be preserved.

  • Directory permissions do not allow creation of a new file.

  • POSIX ACL cannot be preserved (if enabled).

  • SELinux security context can not be preserved (if enabled).
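
The write-then-rename sequence can be sketched as follows. This is a minimal illustration and not the library's implementation: save_atomic() is a hypothetical helper, the temporary path is limited to a fixed-size buffer, and the restoration of ownership, ACLs and security contexts referred to in the list above is omitted entirely.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/* Write len bytes of data to "filename~", flush them to disk and
 * atomically move the result over filename using rename(2). */
static bool save_atomic(const char *filename, const char *data, size_t len) {
	assert(filename != NULL && (data != NULL || len == 0));
	char tmp[4096];
	int n = snprintf(tmp, sizeof tmp, "%s~", filename);
	if (n < 0 || (size_t)n >= sizeof tmp)
		return false;
	int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (fd == -1)
		return false;
	bool ok = true;
	while (ok && len > 0) {
		ssize_t w = write(fd, data, len);
		if (w == -1) {
			if (errno == EINTR)
				continue; /* interrupted, retry */
			ok = false;
		} else {
			/* short writes are possible, continue with the rest */
			data += w;
			len -= (size_t)w;
		}
	}
	if (ok && fsync(fd) == -1)
		ok = false;
	if (close(fd) == -1)
		ok = false;
	/* readers see either the old or the new content, never a mix */
	if (ok && rename(tmp, filename) == -1)
		ok = false;
	if (!ok)
		unlink(tmp);
	return ok;
}
```

Because the data is written to a separate file first, an I/O failure leaves the original file untouched, in contrast to overwriting in place.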


Overwrite file in place.


I/O failure might cause data loss.

bool text_save(Text*, const char *filename)

Save the whole text to the given file name.

bool text_save_range(Text*, Filerange*, const char *filename)

Save a file range to the given file name.

TextSave *text_save_begin(Text*, const char *filename, enum TextSaveMethod)

Set up a sequence of write operations.

The returned TextSave pointer can be used to write multiple, possibly non-contiguous, file ranges.


For every call to text_save_begin() there must be exactly one matching call to either text_save_commit() or text_save_cancel() to release the underlying resources.

ssize_t text_save_write_range(TextSave*, Filerange*)

Write file range.


Returns the number of bytes written, or -1 in case of an error.

bool text_save_commit(TextSave*)

Commit changes to disk.


Returns whether the changes have been saved.


Releases the underlying resources and free(3)’s the given TextSave pointer which must no longer be used.

void text_save_cancel(TextSave*)

Abort a save operation.


Does not guarantee to undo the previous writes (they might have been performed in-place). However, it releases the underlying resources and free(3)’s the given TextSave pointer which must no longer be used.

ssize_t text_write(Text*, int fd)

Write whole text content to file descriptor.


Returns the number of bytes written, or -1 in case of an error.

ssize_t text_write_range(Text*, Filerange*, int fd)

Write file range to file descriptor.


Returns the number of bytes written, or -1 in case of an error.


bool text_mmaped(Text*, const char *ptr)

Check whether ptr is part of a memory mapped region associated with this text instance.