paxprose  ·  3387 days ago  ·  post: Hubski development progress update: internal api good, Arc bad, external api sooner.

I wish there were two of you so one could keep an adequate log of all the shit you have to do throughout your conversion process. I'm sad to have missed your livestream.

Enhancing the performance of your queries is a lot of guesswork without the right tools. I need to jump feet first into some PostgreSQL, but that sweet Cassandra is calling my name.

Does the Arc SQL adapter support the use of stored procedures? Have you tried wrapping a view around one of the bulkier tables to filter the results a little and maybe weed out some unnecessary/problem columns? Is your dataset normalized?
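
To make the view question concrete, here is a minimal sketch using Python's built-in sqlite3 as a stand-in for the real database. The table name, column names, and the `deleted` flag are all hypothetical and not taken from Hubski's actual schema; the point is only that a view can hide bulky columns and pre-filter rows so the application queries something slimmer.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory database with a hypothetical bulky table and a slim view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Hypothetical wide table: names are illustrative only.
        CREATE TABLE publications (
            id         INTEGER PRIMARY KEY,
            title      TEXT,
            body       TEXT,     -- bulky column the view leaves out
            searchtext TEXT,     -- bulky column the view leaves out
            deleted    INTEGER   -- rows with deleted = 1 are filtered out
        );
        INSERT INTO publications VALUES (1, 'first',  'long body', 'long text', 0);
        INSERT INTO publications VALUES (2, 'second', 'long body', 'long text', 1);
        INSERT INTO publications VALUES (3, 'third',  'long body', 'long text', 0);

        -- The view exposes only the narrow columns and the live rows.
        CREATE VIEW live_publications AS
            SELECT id, title FROM publications WHERE deleted = 0;
        """
    )
    return conn


def slim_rows(conn):
    """Query the view exactly as if it were a table."""
    return conn.execute(
        "SELECT id, title FROM live_publications ORDER BY id"
    ).fetchall()
```

Whether this helps depends on where the time actually goes: a plain view narrows what comes back, but it does not reduce the number of queries issued.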

I know a bunch of questions isn't going to help, but holy snark I love me some DB discussion.

rob05c  ·  3387 days ago

    Is your dataset normalized?

It's BCNF, if you consider processed data unique, and consider null a value. But it stores processed data, which ought to be computed at runtime. For example, text, md, searchtext. The application has to be changed to fix that. That will come later.

    weed out some unnecessary/problem columns?

There are no unnecessary columns at this point. We just converted the data to SQL. I did try removing the massive searchtext table from the query. Didn't help.

    Does the Arc SQL adapter support the use of stored procedures?

No, and I will avoid them, unless absolutely necessary for performance. Code should be in the application.

Right now, it's looking like it just needs to do bulk loading. It's not the network. When the app starts, we load ~250k publications. With 13 tables, that's roughly 3 million queries. Bulk loading makes it 14 queries. Even for several hundred megabytes of data, the per-query overhead of that many queries far outweighs the cost of transferring the data itself. I've faced this problem professionally before, and seen a 20× speedup in a similar situation.
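
The difference in query counts can be sketched as follows, using Python's built-in sqlite3 rather than the Arc adapter. The `field0`, `field1`, ... table names and the `pub_id`/`value` columns are hypothetical stand-ins for the 13 real tables. This is an illustration of the two access patterns under those assumptions, not the actual Hubski loader.

```python
import sqlite3


def build_db(n_pubs, n_tables):
    """Create n_tables hypothetical tables, each holding one value per publication."""
    conn = sqlite3.connect(":memory:")
    for t in range(n_tables):
        conn.execute(
            f"CREATE TABLE field{t} (pub_id INTEGER PRIMARY KEY, value TEXT)"
        )
        conn.executemany(
            f"INSERT INTO field{t} VALUES (?, ?)",
            [(i, f"v{t}-{i}") for i in range(n_pubs)],
        )
    return conn


def load_per_row(conn, n_pubs, n_tables):
    """One query per publication per table: n_pubs * n_tables queries."""
    pubs, queries = {}, 0
    for pub_id in range(n_pubs):
        pub = {}
        for t in range(n_tables):
            row = conn.execute(
                f"SELECT value FROM field{t} WHERE pub_id = ?", (pub_id,)
            ).fetchone()
            queries += 1
            pub[f"field{t}"] = row[0]
        pubs[pub_id] = pub
    return pubs, queries


def load_bulk(conn, n_tables):
    """One query per table: fetch every row at once, then group in memory."""
    pubs, queries = {}, 0
    for t in range(n_tables):
        queries += 1
        for pub_id, value in conn.execute(f"SELECT pub_id, value FROM field{t}"):
            pubs.setdefault(pub_id, {})[f"field{t}"] = value
    return pubs, queries
```

Both loaders build the same in-memory result. With 250k publications and 13 tables, the per-row pattern issues 250,000 × 13 = 3.25 million queries, while this sketch's bulk pattern issues 13 (the count of 14 above may include a query not modeled here).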

It also took me 3 months to do a large OOP system. This will not take 3 days. LISP ≫ C++.

paxprose  ·  3387 days ago

I can't wait to see how you finalize your approach. I have one co-worker nodding his head violently at the idea that business logic should be handled by the application and enhanced by interfaces as needed, while another swears by handling everything in the db.

I lean toward the latter, but I know nothing.

Thanks for the updates, man. Godspeed.

rob05c  ·  3387 days ago

    another swears by handling everything in the db.

Coding Horror summarises my feelings well.

    Have you ever worked on a system where someone decreed* that all database calls must be Stored Procedures, and SQL is strictly verboten? I have

I also have, and his experience and conclusion both parallel mine.