comment by mk
mk  ·  3520 days ago  ·  post: Pubski: April 8, 2015

It would be better if we were using SQL. What we have is less efficient. Hubski data is stored in files, and much of the search functionality accesses only the files that are currently loaded. (If you want to go deep, you have to filter by a user or a tag, e.g. https://hubski.com/query?id=tag:space%20mars)

To make things a bit faster, text is processed upon submission/editing to remove dupes and to cut out conjunctions, prepositions and pronouns, and that modified text is what gets searched.
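For anyone curious what that preprocessing step looks like, here is a rough sketch in Python. It is not the actual search.arc code (that is written in Arc); the word list, the tokenizing rule, and the function names `index_text` and `matches` are all made up for illustration.

```python
import re

# Assumed sample of conjunctions, prepositions, and pronouns.
# The real list used by Hubski is not shown in this thread.
STOP_WORDS = frozenset({
    "and", "or", "but", "nor", "so", "yet",           # conjunctions
    "to", "on", "in", "at", "of", "by", "for", "with", "from",  # prepositions
    "i", "you", "he", "she", "it", "we", "they", "me", "him", "her", "us", "them",  # pronouns
})


def index_text(text: str) -> str:
    """Return the reduced text that would be stored for searching.

    Lowercases the input, drops stop words, and keeps only the first
    occurrence of each remaining word (duplicate removal).
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    seen = set()
    kept = []
    for word in words:
        if word in STOP_WORDS or word in seen:
            continue
        seen.add(word)
        kept.append(word)
    return " ".join(kept)


def matches(query: str, indexed: str) -> bool:
    """True if every meaningful query term appears in the reduced text."""
    terms = index_text(query).split()
    tokens = set(indexed.split())
    return bool(terms) and all(term in tokens for term in terms)


if __name__ == "__main__":
    stored = index_text("The rover and the lander went to Mars, and it landed on Mars")
    print(stored)
    print(matches("mars rover", stored))
```

The trade-off is the usual one: the stored text is smaller and quicker to scan, but phrase order and word counts are lost, so exact-phrase search and relevance ranking are off the table.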

It's very interesting. There are a number of ways that we could make search faster and better, and improving our search.arc would be a fun project. If we didn't have a working plan to replace it, I would.

You might have noticed that the rewrite is taking a long time. Our initial approach was problematic. A complete frontend/backend rewrite was a lot to chew on, and it was becoming an enormous energy sink with a finish line that wouldn't stay in one place. As a result, we decided to take a different approach. We are now getting hubski.arc to read/write to a relational database (PostgreSQL). Once that is done, the first step is to put in place a proper search app that is well matched to it (Elasticsearch). The Hubski app will benefit as well, as some functionality will be able to take advantage of the new data structure. We are then building out an API for that database, and finally, we will replace hubski.arc, most likely with the node.js app that forwardslash has largely built. That's the general plan. It's one where we can see the finish lines, and all of us will feel incremental improvements as we cross them. Also, we should never have to roll out a less-functional Hubski at any point, which our original approach most certainly would have required.