Tuning SQLite Performance

SQLite is an extremely fast database engine, but it can get slow quickly if you scale the the amount of data without making adjustments. Here are the main things to keep in mind:

Transactions
If you're processing more than one row of data, setup a transaction first. The performance difference is enormous. In the DataCrux (now at datacrux.com!) switchover to SQLite 3, the slight change to the query format for setting up a transaction had gone unnoticed by me, and things were taking an insanely long amount of time. For example, in one DataCrux app currently in beta, a data import took about 4 minutes without a transaction. With the transaction code fixed, the same task was completed in less than a second!

Virtual Machines
It takes some slight adjustments to how you go about writing your code, but using virtual machines (sqlite3_stmt * structs) can have a significant impact on speed. SQLite doesn't have to spend time compiling entire query strings. It just reads the input values and applies the changes. DataCrux now dynamically generates these each time a  database file is opened.

Indexes
Indexes are created on columns which are frequently used in 'where' clauses. This speeds up selects, but it can significantly slow down inserts and updates. DataCrux currently does not automatically create indexes for this reason. In any case, I believe the biggest speed hit on retrieving data is re-instantiating the objects and setting their values, not finding the data in sqlite. If you do a lot more reading of data than writing, creating indexes clearly makes sense.

Sync Mode
This is perhaps the single biggest factor in how fast SQLite performs. Transactions can save you from a lot of the overhead of using individual file operations for each query statement, but it's not always possible to setup a transaction. In such a case, you may want to consider sending this query string to the database:

PRAGMA synchronous = OFF;

The performance difference can be dramatic. The tradeoff is that if a kernel panic or power loss occurs before the data reaches the disk, it's possible for the database file to become corrupt. Though, according the docs, application crashes pose no threat.
Design Element
Tuning SQLite Performance
Posted Jan 5, 2005 — 6 comments below




 

Oskar — Jan 05, 05 51

I'm quite interested in where you will take DataCrux when Tiger is out. Isn't it supposed to include an SQLite server in the Cocoa application framework? Think it's called CoreData. Supposed to be very fast.

Carla — Jan 05, 05 52

I'm just stopping to say hi since it's de-lurker day. I really love your design.

Scott Stevenson — Jan 05, 05 53 Scotty the Leopard

Oskar: For the most part, it doesn't seem to make a lot of sense to position DataCrux as an alternative to Core Data for Tiger applications. However, there are a whole lot of people that are writing now or in the near future that are not prepared to require Tiger for at least a year after it's out. For that matter, there are DataCrux users that want me to maintain 10.2 compatibility!

That said, there's one particularly interest aspect of DataCrux that Apple hasn't talked about for Core Data. That feature isn't public knowledge yet but is already in production use at a few places.

Scott Stevenson — Jan 05, 05 54 Scotty the Leopard

Thanks for the compliment, Carla.

Oskar — Jan 06, 05 55

Very valid points there, Scott. Sometimes I think Apple is a bit too rash in putting out new updates and forcing users to update, and it's an honorable thing for you to preserve functionality for those who don't.

De-lurker day?

Scott Stevenson — Jan 06, 05 56 Scotty the Leopard

To me, the issue isn't even so much forcing users to upgrade. The reality is that most people don't look into OS upgrades unless they're heavily marketed in mainstream media. And even if they're interested, they may not want to spend the money or time to install it.

Whether all of that matters to you depends on your market. The more mainstream the target audience, the more you typically have to consider platform versions. Though it's also worth balancing that with development time. If a developer uses API specific to a certain version of the OS, he may still see a net benefit even with a reduced audience if the API saves a significant amount of time or enables certain functionality.




 

Comments Temporarily Disabled

I had to temporarily disable comments due to spam. I'll re-enable them soon.





Copyright © Scott Stevenson 2004-2015