Influx Continuous Dev Log

Developing Influx

, a content-based, self-guided, NLP-enhanced integrated language learning environment emphasizing language exposure and active learning.

H2 2025-05-25

Annotated text rendering and range selection is back, rewritten in Elm

H2 2024-06-23

Tokenisation model outputs are now cached for faster load time (apparently building the document tree is still somewhat slow)

First load was about 16 frames, second load (cached) took 12 frames.

H2 2024-05-12

Open MacOS’s dictionary
Use Google Translate’s API
Slicing the document

There’s some issue with NLP server becoming zombie process when Influx terminates and documents not loading, but we managed to 1. turn SvelteKit development server into static package 2. embed compiled Python server in Tauri as a sidecar 3. spawn an Axum API server from Tauri’s process

{
	"type": "PhraseToken",
	"sentence_id": 1,
	"text": "world wide web",
	"normalised_orthography": "world wide web",
	"start_char": 70,
	"end_char": 84,
	"shadowed": false,
	"shadows": [
		{
			"type": "SingleToken",
			"sentence_id": 1,
			"id": 12,
			"text": "world",
			"orthography": "world",
			"lemma": "world",
			"start_char": 70,
			"end_char": 75,
			"shadowed": true,
			"shadows": []
		},
		{
			"type": "Whitespace",
			"text": " ",
			"orthography": " ",
			"start_char": 75,
			"end_char": 76,
			"shadowed": true,
			"shadows": []
		},
		{
			"type": "SingleToken",
			"sentence_id": 1,
			"id": 13,
			"text": "wide",
			"orthography": "wide",
			"lemma": "wide",
			"start_char": 76,
			"end_char": 80,
			"shadowed": true,
			"shadows": []
		},
		{
			"type": "Whitespace",
			"text": " ",
			"orthography": " ",
			"start_char": 80,
			"end_char": 81,
			"shadowed": true,
			"shadows": []
		},
		{
			"type": "SingleToken",
			"sentence_id": 1,
			"id": 14,
			"text": "web",
			"orthography": "web",
			"lemma": "web",
			"start_char": 81,
			"end_char": 84,
			"shadowed": true,
			"shadows": []
		}
	]
},

And info dumping in frontend

Better error handling on db-related API handlers
Phrase database
Implement a Trie library
Implement phrase fitting algorithms
Automatic TypeScript type generation based on Rust structs
Svelte behind the scene
- type annotations
- token reference by SentenceConstituent, not text string

Phrase fitting (see Influx Dev Log - Phase I

)

Phrases:

{
	[1, 2, 3], 
	[1, 2, 3, 4], 
	[1, 2, 3, 4, 5],
	[6, 7],
	[7, 8, 9]
}

Greedy result, less optimal:

[[1, 2, 3, 4, 5], [6, 7], [8], [9]]

Best fit result:

[[1, 2, 3, 4, 5], [6], [7, 8, 9]]

Better create/update handling - sync id to front-end and change action type accordingly

Let's write    some    SUpeR   nasty text, 

	tabs haha

shall we?

Parsed as:

&sentences = [
    Sentence {
        id: 0,
        text: "Let's write    some    SUpeR   nasty text,",
        start_char: 0,
        end_char: 42,
        constituents: [
            CompositToken {
                sentence_id: 0,
                ids: [
                    1,
                    2,
                ],
                text: "Let's",
                start_char: 0,
                end_char: 5,
            },
            SubwordToken {
                sentence_id: 0,
                id: 1,
                text: "Let",
                lemma: "let",
            },
            SubwordToken {
                sentence_id: 0,
                id: 2,
                text: "'s",
                lemma: "'s",
            },
            Whitespace {
                text: " ",
                start_char: 5,
                end_char: 6,
            },
            SingleToken {
                sentence_id: 0,
                id: 3,
                text: "write",
                lemma: "write",
                start_char: 6,
                end_char: 11,
            },
            Whitespace {
                text: "    ",
                start_char: 11,
                end_char: 15,
            },
            SingleToken {
                sentence_id: 0,
                id: 4,
                text: "some",
                lemma: "some",
                start_char: 15,
                end_char: 19,
            },
            Whitespace {
                text: "    ",
                start_char: 19,
                end_char: 23,
            },
            SingleToken {
                sentence_id: 0,
                id: 5,
                text: "SUpeR",
                lemma: "SUpeR",
                start_char: 23,
                end_char: 28,
            },
            Whitespace {
                text: "   ",
                start_char: 28,
                end_char: 31,
            },
            SingleToken {
                sentence_id: 0,
                id: 6,
                text: "nasty",
                lemma: "nasty",
                start_char: 31,
                end_char: 36,
            },
            Whitespace {
                text: " ",
                start_char: 36,
                end_char: 37,
            },
            SingleToken {
                sentence_id: 0,
                id: 7,
                text: "text",
                lemma: "text",
                start_char: 37,
                end_char: 41,
            },
            SingleToken {
                sentence_id: 0,
                id: 8,
                text: ",",
                lemma: ",",
                start_char: 41,
                end_char: 42,
            },
        ],
    },
    Whitespace {
        text: " \n\n\t",
        start_char: 42,
        end_char: 46,
    },
    Sentence {
        id: 1,
        text: "tabs haha",
        start_char: 46,
        end_char: 55,
        constituents: [
            SingleToken {
                sentence_id: 1,
                id: 1,
                text: "tabs",
                lemma: "tabs",
                start_char: 46,
                end_char: 50,
            },
            Whitespace {
                text: " ",
                start_char: 50,
                end_char: 51,
            },
            SingleToken {
                sentence_id: 1,
                id: 2,
                text: "haha",
                lemma: "haha",
                start_char: 51,
                end_char: 55,
            },
        ],
    },
    Whitespace {
        text: "\n\n",
        start_char: 55,
        end_char: 57,
    },
    Sentence {
        id: 2,
        text: "shall we?",
        start_char: 57,
        end_char: 66,
        constituents: [
            SingleToken {
                sentence_id: 2,
                id: 1,
                text: "shall",
                lemma: "shall",
                start_char: 57,
                end_char: 62,
            },
            Whitespace {
                text: " ",
                start_char: 62,
                end_char: 63,
            },
            SingleToken {
                sentence_id: 2,
                id: 2,
                text: "we",
                lemma: "we",
                start_char: 63,
                end_char: 65,
            },
            SingleToken {
                sentence_id: 2,
                id: 3,
                text: "?",
                lemma: "?",
                start_char: 65,
                end_char: 66,
            },
        ],
    },
]

With much better tokenization + sentence segmentation + whitespace handling!

CleanShot 2023-12-23 at 20.06.00.jpg

H2 2023-12-22

Text reader skeleton with working database read and write

CleanShot 2023-12-23 at 09.45.01.jpg

H2 2023-12-21

Content directory listing works!

CleanShot 2023-12-23 at 09.42.07.jpg

zettelkasten

Influx Continuous Dev Log

Towards an Integrated Content-Based Language Learning Environment: An Exploratory Proposal

H2 2025-05-25

H2 2024-06-23

H2 2024-06-08

H2 2024-05-13

H2 2024-05-12

H2 2024-05-11

H2 2024-01-13

H2 2024-01-11

H2 2024-01-10

H2 2024-01-09

H2 2024-01-08

H2 2024-01-07

H2 2024-01-06

H2 2024-01-05

Influx Dev Log - Phase I

H2 2024-01-04

H2 2023-12-30

H2 2023-12-29

H2 2023-12-27

H2 2023-12-24

H2 2023-12-23

H2 2023-12-22

H2 2023-12-21

Backlinks

Towards an Integrated Content-Based Language Learning Environment: An Exploratory Proposal

Influx Dev Log - Phase I

Influx - Phase 1

Projects

Local Graph