jEdit bugs: Getting off the Ground with Kinto and Kinto.js

This is a guest post by Eric Le Lay on his experience using Kinto.

Introduction

Have you heard of jEdit — the programmer's text editor? It's a multi-platform text editor, written in Java, with a myriad of plugins. I've been part of the development team for ten years. It's hosted on Sourceforge, with the code base under Subversion.

This is a labour of love for us, contributing during our free time. For me, that's when I'm on holidays. I don't always have Internet access where I'm staying. Or it's via a short Ethernet cable in a dark and overheated room. Don't lock me out of the sunny garden! Give me everything I need off-line!

Java sources and Javadoc: install the openjdk7-doc and openjdk7-src packages;
jEdit and plugins sources: with git-svn I get the code and history;
bugs and feature requests: oops!

We have 1100 open tickets and 7600 closed. None of them are available offline. That needs to change!

Grabbing a ticket archive

Sourceforge lets project admins download an archive of their project, containing tickets and other things. A script of mine converts them to a bunch of HTML files, with a little JavaScript for navigation. Resulting HTML files are then uploaded to jedit.org/trackers where they can be browsed online or downloaded.

To save space, only open tickets are available.

Sourceforge takes minutes to generate the project archive, so I run the export at most once per week. It's not convenient for users either, because they have to download a new archive to get an up-to-date view of tickets. Even downloading the latest archive doesn't give them the freshest state, because it can be a week old at worst: lots of room for improvements!

My Goals

be able to incrementally update tickets;
better ticket search and visualization; maybe even offline modification;
I'd like to try Kinto :-).

First Steps with Kinto

I've heard about Kinto from Rémy and Alexis. As with couchdb/pouchdb, it's possible to sync the database between the server (Kinto) and the browser (kinto.js). Kinto.js stores data in IndexedDB so it should be possible to work with a biggish database.

Each user can create his own collections (think SQL tables) in Kinto. There are even finer grained permissions, document by document (think SQL rows).

To Do List

I've begun with the tutorial. It nicely introduces Kinto basics: create documents, show them, modify them, sync with a remote server. It uses Mozilla's hosted test server: one less barrier to try it out.

Tickets instead of Tasks

I've moved to a local Kinto server for my next steps: it's very easy to deploy using docker. If you plan to batch-import documents, install the PostgreSQL backend straight away (easy to install using docker-compose; just follow the instructions). The In-Memory backend has issues with parallel batch import requests (which happens when you import in a node.js script with request).

Once tickets are imported, the webapp has just to fetch them via kinto.js's sync()and display them.

Local Search with kinto.js

Searching in the browser with kinto.js is not as rich as on the server: it supports only a conjunction of key: [value1, value2, ...] parameters. Querying the id, _status, last_modified is accelerated by an index (code here). All documents are then loaded and returned in an array, then filtered according to remaining query params. Arbitrary queries are done by the application on the result of the list() operation. This is fine for small collections, but searching is not lighting fast when you reach 8k documents.

An example of full-text search (from a keyword in a text field):

function filterByKeyword(keyword){
  return function(doc){
    return doc.summary.indexOf(keyword) >= 0;
  };
}

function search(){
  var keyword = document.getElementById("query");
  return tickets
    .list()
    .then(function(res){
      return res.data.filter(filterByKeyword(keyword));
    });
}

// on submit
search().then(render);

Architecture

I'm hosting my Kinto server on a C1 server at scaleway. It contains 2 collections:

tickets, for open or recent tickets;
oldTickets, for tickets closed before 2016.

I use Sourceforge's ticket id instead of an auto-generated id. This way I can update the database more easily (no need to fetch the kinto document before updating it, since I know the content and the id).

Clients are web applications.

An updater script forwards modifications from Sourceforge to the Kinto database.

The C1 machine has an ARM processor and runs Debian Jessie. So a few deviations to the standard setup were necessary (see the wiki kinto).

The kinto.js Client

My goal was to learn kinto.js, not your latest front-end framework, so I quite liked the vanilla take on the tutorial and stuck with it:

bootstrap CSS, fontawesome icons,
a few utilities: moment, lodash,
datatables (after devising my own row management, but not wanting to implement row details myself and having found a nice example for this).

screenshot of the application

Initial State

When you load the application for the first time a panel shows you ticket counts, lets you choose to include old tickets and download the database.

Here is how I determine if it's the first run:

tickets.db.getLastModified().then(function(res) {
    if (res) {
        hasDB();
    } else {
        noDB();
    }
});

To display the server's ticket count before sync, one has to go down to the kinto-http package (this is being worked on, see this issue).

code snippet:

function fetchTotalRecords(api, bucket, name, etag) {
    var headers = {};
    headers["Authorization"] = "Basic " + btoa("token:tr0ub4d@ur");
    return api.client.execute({
        path: "/buckets/" + bucket + "/collections/" + name + "/records" + (etag ? ("?_since=" + etag) : ""),
        method: "HEAD",
        headers
    }, {
        raw: true
    }).then(function(res) {
        console.log("RES", res);
        return parseInt(res.headers.get("total-records"));
    });
}

tickets.db.getLastModified().then(function(lastMod) {
    var api = tickets.api.bucket(tickets.bucket).collection(tickets.name);
    // you'll soon be able to replace it with api.getLastModified(lastMod)
    return fetchTotalRecords(api, tickets.bucket, tickets.name, lastMod);
}).then(renderTicketCount);

This method also works to find out if updated tickets exist on the server ( via the ETag). The kinto server doesn't respond instantly as it should. Maybe I should also put the ETag in an If-None-Match header to speed things up.

Searching for Tickets

I've used lunr to implement client-side full text search. It builds a Lucene-like index and then lets one query this index. The index is serialized and saved as a document in a local Kinto.js collection to be reused on next page load.

Searching the index returns an array of document ids, from which I load documents before filtering the resulting (usually small) array with remaining query params. This is performing better than the first version of the application, when I first loaded all documents by calling tickets.list().

To search for hello in the lunr index:

> index.search("hello")
   [{"ref":"52a4d8cb27184667e3400f42","score":0.019459983366287688},
    {"ref":"5690ff3f24b0d96f08df7d6a","score":0.013994677485991343},
    ...
   ]

To get a { data: ['array', 'of', 'tickets', 'containing', 'myText'] } Promise:

return Promise.all(index.search(myText).map(function(hit) {
    return tickets.get(hit.ref);
})).then(function(allRes) {
    return {
        data: allRes.map(function(res) {
            return res.data;
        })
    };
});

Matching documents are fetched from IndexedDB in parallel via their Id (collection.get()). It's fast because IndexedDB is optimised for key based access, but it could be more efficient to use tickets.list({filters: {id: ['array', 'of', 'ids']}}) to curb request count.

Displaying a Ticket

My application is meant to be multi-tab, browser history and bookmarks compatible. To achieve that, the UI state is stored in the URL and interactions result in calls to pushState().

var state;

function encodeState(state) {
    return "#" + encodeURIComponent(JSON.stringify(state));
}

function applyState(state) {
    document.getElementById("search").value = state.filter;
    search();
}

function saveState(state) {
    window.history.pushState(state, "jEdit Trackers", encodeState(state));
}

function hashIsTicket() {
    return window.location.hash.match(/^#[a-z0-9-]+$/);
}

window.addEventListener("popstate", function(e) {
    // called on browser history navigation
    if (e.state) {
        state = e.state;
        applyState(state);
    } else {
        if (hashIsTicket()) {
            // show this ticket
        } else {
            // show search interface
        }
    }
});

A ticket can be displayed from the result list by clicking Open. Nothing fancy: the id is put in the URL's hash and the application parses the hash at startup.

function restoreInitialState() {
    if (window.location.hash.match(/^#[a-z0-9-]+$/)) {
        showTicket(window.location.hash.substring(1));
    } else if (window.location.hash.startsWith("#%7B")) {
        try {
            state = JSON.parse(decodeURIComponent(window.location.hash.substring(1)));
        } catch (e) {
            console.log("invalid state in hash", window.location.hash);
            state = defaultState();
        }
        applyState(state);
    } else {
        state = defaultState();
        applyState(state);
    }
}

restoreInitialState(); // to restore state from bookmark or link at page load

Updates

Documents have to be updated in Kinto whenever a ticket is modified at Sourceforge's trackers. I do this via a server-side program.

updater.py

There is no way to get a real-time modification stream from Sourceforge.

A few ideas to work around that:

poll each tracker's RSS feed (limited to 10 tickets so one can miss updates if the script lags at some point);
poll each tracker's REST API, asking for tickets modified since last time (delay < polling period);
subscribe to the jedit-devel mailing list and fetch mails with a pop3 client (delay < 1 minute).

I've chosen solution 2. The updater.py script:

fetches the trackers list from Sourceforge,
fetches from kinto the last update date (a document in the updater collection),
queries Sourceforge for tickets modified since this date,
pushes updates to the tickets and oldTickets collections.

Tickets are returned by Sourceforge in JSON format. There is only minimal mapping to do (fetch comments via a second request to Sourceforge).

Client Notifications

Kinto seems to provide websockets notifications but I've not looked into it.

Clients are instead polling the kinto server for new tickets every minute. When there are, a little would you like to sync? notification appears on the top-right of the page.

Following a synchronization, I update the lunr index with modified/added/deleted documents (index.add(document), index.remove(document)).

Counters displayed on buttons (missed, open bugs) are then recomputed and stored in a client collection. When there are no deletions I can update them incrementally, otherwise I have to go through the whole collection to recompute them.

Conclusion

Was Kinto Worth a Try?

Definitly yes. It's a sound platform and a good fit for my application. However it's not an out-of-box experience like meteor. One has to combine it with other libraries to create a full application. On the other hand, it makes deployment and debugging simpler.

Performance is fine (batch importing the 8k tickets keeps my little server busy for 5 minutes) and the system is open.

I made things a bit complicated for myself by splitting tickets into 2 collections, but not loading the old tickets is a huge performance boost.

To Do List (not the tutorial)

I've spent 3 weeks exploring Kinto and still have things to try out. I'm also far from a full-fledged application.

Some leads:

Modifying Tickets I'd like to be able to modify tickets from the application, instead of having to use the Sourceforge UI. Modifications could be possible offline, kept in kinto.js until network connectivity is restored.

Nothing prevents it but it takes a lot of development to make a nice UI. I'll first look into FormBuilder before rolling out my own. If actual interaction with trackers is too much, maybe letting users tag, star, follow tickets for themselves could be useful (stored in kinto to sync between their devices).

User Interface The UI could be improved greatly. I also plan to refactor the code to take advantage of react, since there exists some boilerplate and kinto-admin for inspiration.

Attachments Tickets are available offline, but not attachments (attachments like the log with the stack trace I need to fix a bug). So no 100% offline experience yet! I'd like to distribute these attachments, maybe using Kinto Attachments. Attachments are a total of 10MB for fresh tickets, plus 18MB for old ones: it seems feasible.

Full Text Search I make very basic use of lunr.js: indexing the title and description fields. But the index is already quite big and slow to (de)serialize from IndexedDB. I wonder if I should enrich it, allowing more powerful queries or define my own query language and use indexes on my side.

Indexing should take place in a WebWorker, to not stall the application.

Performances Tickets could be indexed by number, to speed up this kind of query (ids are unique in a project, numbers are unique in a tracker). Also, indexing on other predefined search criteria could speed things up.

Manually managing indexes would be feasible (a document per index, mapping each value of the field to an array of document ids). Then, looking for all open bugs would look like:

indexes.get("bug").then(resInd){
    var ind = resInd.data; // ind = { open: ["id1", "id2", ...], ...}
    return Promise.all((ind.open).map(tickets.get.bind(tickets)))
                  .then(function(multiRes){
                      return multiRes.map(function(res){ return res.data;})
                  });
});

Thank You!

I'd like to thank the Kinto team for their kindness and their openness: reachable on IRC (#kinto) and Slack, they encourage issues and PRs. Many of which are even accepted ;-)

My code is available on github: elelay/jedit-trackers-kinto. The application is available at trackers.elelay.fr. Feedback is welcome via IRC (elelay) — why not even Issues and PR?

Revenir au début