programming

QSqlQueryModel and QTreeView - part 1

Submitted by mimec on 2011-06-19

Qt provides an easy way of populating a table view using an SQL query by using the QSqlQueryModel. It can also be used with a QTreeView, but only for displaying flat lists. If we want to display a hierarchy of items based on a set of SQL queries, we have to subclass the QAbstractItemModel on our own, which is not an easy task.

There are many ways in which hierarchies of items can be created and it's probably difficult to abstract it and create an universal solution. The first category is a "homogeneous" tree, in which all items represent the same class of entities, and which could be populated using a single query with Id and ParentId columns. Items whose ParentId is NULL represent top-level items, and items with given ParentId are children of the corresponding parent item. An example would be a tree of categories, which can have sub-categories, sub-sub-categories, etc.

Another category is a "heterogeneous" tree, where each level of the hierarchy corresponds to a different class of entities (i.e. a different table). That's the case which I faced in WebIssues. One of the trees contains projects, which can have child folders, and folders in turn can have child alerts. Another tree contains issue types and their corresponding attributes. There can also be various combinations of these two categories, and more complex scenarios, but they usually can be derived in one way or another from the "heterogeneous" solution, so I will focus only on that case.

To create a tree model based on multiple related SQL queries, we need the following components:

  • A subclass of QAbstractItemModel providing an implementation of all its abstract methods
  • A collection of QSqlQueryModels, each of which contains items at a different level of hierarchy
  • Some internal data structures used for tracking dependencies between items

For now I will skip the implementation of the model's abstract methods, as this is a topic for a separate post. Let's assume that we want to represent the following simple tree of items:

+ Project 1
| + Folder 1
| | + Alert 1
| | + Alert 2
| + Folder 2
+ Project 2
  + Folder 3

The first level of the hierarchy will be populated using an SQL query returning identifiers and names of projects, for example:

+-----------+-----------+
| ProjectId | Name      |
+-----------+-----------+
|         1 | Project 1 |
+-----------+-----------+
|         2 | Project 2 |
+-----------+-----------+

The second level of hierarchy will be based on the following query results:

+----------+-----------+----------+
| FolderId | ProjectId | Name     |
+----------+-----------+----------+
|        1 |         1 | Folder 1 |
+----------+-----------+----------+
|        2 |         1 | Folder 2 |
+----------+-----------+----------+
|        3 |         2 | Folder 3 |
+----------+-----------+----------+

As you can see, the first two folders belong to the first project, and the third folder belongs to the second project, just like on the first diagram. A similar query, this time with AlertId, FolderId and Name columns, can be used to populate alerts, i.e. the third level of hierarchy, and this could be continued to create even more levels.

We assumed that the first column is always the primary key, and the second column is the foreign key related to the parent table (except for the first, root level). So these columns constitute the structure of the tree. Subsequent columns contain data which is displayed in the tree view (there can be more columns than just the name, but that's also a topic for another post).

Now let's get to the internal structures which are needed to keep track of items and relations between them. As you know, each item, or rather cell, of the Qt data model, is represented by an index. The index consists of a row index, column index and an optional parent index. In case of a flat list or table, there are no parent/child relations between items, so the row and column indexes are sufficient to describe a cell. In a tree model, the index must also contain a pointer to some internal data structure, which is necessary to determine the parent index and to keep track of the child items.

At the minimum, the internal data structure (let's call it a "node" for simplicity) should store the level of hierarchy, at which it is located, and the index of the parent item in the parent node. Note that the node does not correspond to a single item in the model, but rather to a group of items with the same parent.

For example, in the model shown on the first diagram, there would be four nodes. The first node, or the root node, represents all top level items, i.e. all projects. It has a level of 0 and the parent index is irrelevant since it has no parent. The second node represents two folders which belong to Project 1, with a level of 1 and the parent index of 0 (i.e. the index of Project 1 in the first SQL model). The third node represents the single folder belonging to Project 2 (with a level of 1 and parent index of 1). The fourth node represents the alerts belonging to Folder 1 (with a level of 2 and parent index of 0).

Obviously the model needs to keep track of the node structures, so at each level of the tree there should be some kind of a container for its nodes. Also, to make navigation around the tree easier and more efficient, we need some additional data structures serving as sort of indexes. We need to be able to find both parent and child indexes of a given index, and to map indexes of our tree model to indexes of the associated QSqlQueryModels and vice versa, and it's not as easy as it seems.

In the next post I will describe some of the nuances of implementing the model itself.

Cost of constructing an empty QSslCertificate

Submitted by mimec on 2011-05-11

Recently I've been trying to solve a problem in WebIssues which manifests itself as a short delay just before connecting to the server. The UI just freezes for about two seconds. It seemed strange because all network operations are asynchronous so there shouldn't be any delay. A few experiments involving the new QElapsedTimer class revealed that the entire time was spent in the constructor of the CommandManager, i.e. the class responsible for communication with the server.

At the first glance the constructor simply creates a QNetworkAccessManager object and connects to a few signals. But the construction of the network access manager only takes several milliseconds, so where's rest of the time spent? What is easy to forget is that a constructor of an object implicitly calls the default constructors of all its fields, if they have such constructors. Two such members caught my attention: a QSslCertificate and a QSslConfiguration fields which are used when connecting to the server via the HTTPS protocol.

Debugging through the code I found that the default constructor of QSslConfiguration does nothing interesting, but the constructor of QSslCertificate always calls the following function, even when an empty certificate is being constructed:

QSslSocketPrivate::ensureInitialized();

This innocent looking function loads the OpenSSL libraries, initializes all encryption algorithms, seeds the random generator and loads default ciphers and certificates. No wonder it takes whole two seconds! We should keep this in mind and avoid creating an instance of the certificate until it's really needed; it doesn't make sense to load the OpenSSL libraries if we're not using a secure connection at all.

But why does the command manager need a certificate field? It's because of the way it handles the sslErrors signal of the network access manager. Instead of simply cancelling the connection, a dialog box is displayed containing a list of errors and a button which shows the details of the certificate. The user can cancel the connection or ignore the errors and proceed. However if another command is sent to the server after some time, the connection may be closed and it needs to be reestablished, resulting in the sslErrors signal being triggered again.

Obviously we don't want to constantly bother the user with the same error message, that's why the command manager stores the certificate once it's presented to the user and all further errors related to that certificate are silently ignored.

The solution was simple: I replaced the QSslCertificate with QList<QSslCertificate>. The list is initially empty, so no instance of the certificate is created when the command manager is constructed. An additional benefit is that the program remembers all ignored certificates, not just the most recent one.

Obviously the delay will still occur at some point if the connection is secure, but that will happen while processing asynchronous events and the UI will be refreshed by that time, so there's a good chance that the user won't even notice the delay.

Debugging the SQLite driver

Submitted by mimec on 2011-05-09

Some time ago I wrote about creating a custom collation for SQLite, integrating SQLite in a Qt application and creating a custom SQLite driver. Although it's a bit of inconvenience, there are also additional benefits of having a custom SQLite driver, for example better support for debugging.

Tracing executed queries is very useful for debugging complex applications because it gives an idea what's happening during its execution without need to debug it line by line. Most server based database systems include tracing functionalities, but in case of a server-less database such as SQLite the only option is to build in tracing into the client.

There are at least two ways of doing that. The simplest one is to modify the SQLiteDriver::prepare method and include code which prints out the query:

qDebug() << "SQLite:" << query;

The disadvantage of that solution is that if the query contains parameters, their values will not be printed. Furthermore, if the query is prepared once and executed multiple times with various parameters, it will still be printed only once.

Luckily SQLite gives us an API function which can be used to trace executed queries. First we need to define a callback function:

static void trace( void* /*arg*/, const char* query )
{
    qDebug() << "SQLite:" << QString::fromUtf8( query );
}

Now we can register the function using the following code:

sqlite3_trace( db, trace, NULL );

Voilà! All executed queries are printed and SQLite automatically replaces the parameter placeholders with appropriate values.

Another thing that is very useful for debugging is information about errors. Printing out all error messages makes debugging a lot easier, especially that it's quite difficult to handle QSqlQuery errors properly in all places.

SQLite does not have a callback for reporting errors, but QSqlResult, which is a base class of the SQLiteResult has a virtual method called setLastError. It is called with appropriate error details whenever some SQLite function return an error. Normally it only stores the error so it is available by using the lastError mehod. All we need to do is override it so that is prints out the error:

void SQLiteResult::setLastError(const QSqlError& e)
{
    if (e.isValid())
        qDebug() << "SQLite error:" << e.driverText() << e.databaseText();
    SqlCachedResult::setLastError(e);
}

The driverText is the description of the operation that was performed and databaseText is the error message returned by SQLite.

When the error occurs while preparing the statement, for example because the query has invalid syntax, the trace method is not called and no information about the query that failed is printed. The simplest solution is to manually call the trace method when sqlite3_prepare16 returns an error:

trace( NULL, query.toUtf8().constData() );

The QSqlDriver class also has a virtual setLastError method which can be overloaded is order to report errors related to opening the database and handling transactions.

Tracing should obviously be disabled in release builds; it's generally a good idea to disable it by default and only enable it conditionally, for example when a special preprocessor macro is defined.

SQLite: UTF-8 or UTF-16?

Submitted by mimec on 2011-04-22

SQLite stores all text as Unicode characters encoded using either UTF-8 or UTF-16 encoding. The API contains functions which allow passing and retrieving text using either UTF-8 or UTF-16 encoding. These functions can be freely mixed and proper conversions are performed transparently when necessary.

The SQLite driver for Qt uses the UTF-16 version of most functions, because that encoding is used internally by the QString class. However SQLite uses the default UTF-8 encoding internally, so it needs to convert all text back and forth when reading and writing data. Usually such conversions does not affect performance very much. The benefit of using UTF-8 encoding internally is that it requires less storage than UTF-16. According to this email from the SQLite mailing list, the performance gain resulting from reduced size of the database is much bigger than the cost of the conversion.

However there is one case when the cost of conversion may become significant. In the previous post I showed how to implement custom collation based on QString::localeAwareCompare. I mentioned that the function expects text to be encoded using UTF-16 encoding because that allows passing it directly to QString. The use of fromRawData ensures that the data is not even copied to a separate buffer. However SQLite still needs to copy both compared strings to a temporary buffer and convert them from UTF-8 to UTF-16 and this happens every time the strings are compared, which is O(N·logN) when sorting query results by a text column.

I made a little experiment to see which storage format is better in such case and the results are a bit surprising. It turns out that UTF-16 is 4% faster than UTF-8 when sorting 1000 rows and 7% faster when sorting 10,000 rows. However UTF-8 becomes faster with large amounts of data - it is 27% faster than UTF-16 when sorting 100,000 rows. At this point the size of the database (which is 4.3 MB for UTF-8 and 5.7 MB for UTF-16) becomes more significant than the cost of conversion.

So the answer to the question which encoding to use is, as usual, "it depends". In my case the database will be rather small and there will rarely be more than a few thousand rows per query, so I will use UTF-16 encoding to avoid the conversions. There are two ways to achieve that. We can execute the following statement directly after opening the database, before any tables are created:

database.exec( "PRAGMA encoding = \"UTF-16\"" );

Another way is to modify the SQLite driver (since we're already using a copy of it) and use sqlite3_open16 instead of sqlite3_open_v2 to open the database - it enables UTF-16 encoding automatically. In that case we won't be able to open the database in read only mode, but if we're not using the advanced connection options then it's not a problem.

Qt, SQLite and locale aware collation

Submitted by mimec on 2011-04-17

One of the principles of the WebIssues Client is that it stores a local cache of data retrieved from the server and performs many operations (including sorting, filtering, etc.) locally. This saves a lot of network bandwidth and server workload, but it also makes implementing the client more difficult. So far all data has been simply stored in memory using multi-dimensional hash tables (available as RDB classes) and that solution has been working fine for hundreds or even thousands of issues, but it is not scalable for larger amounts of data.

I've been trying to find a solution where unused data is saved to disk and loaded on demand, but it would be difficult to implement it efficiently and reliably. Finally I decided that instead of trying to reinvent the wheel, I should use a technology which solves all these problems and works fine for many applications, including two major web browsers, Firefox and Chrome. I am currently migrating the entire data storage of WebIssues Client to use the SQLite database engine.

Qt comes with built-in support for SQLite which seems enough for most purposes. However after some investigation it turned out that SQLite lacks some important features that can be easily implemented by extending it through its API. One of such features is support for a case sensitive, locale aware collation supporting Unicode characters, which is required by WebIssues to correctly sort string using accented and non-latin characters.

In order to create such collation we need to write a simple function which takes advantage of built-in Unicode support in the Qt framework:

static int localeCompare( void* /*arg*/, int len1, const void* data1,
    int len2, const void* data2 )
{
    QString string1 = QString::fromRawData( reinterpret_cast<const QChar*>( data1 ),
        len1 / sizeof( QChar ) );
    QString string2 = QString::fromRawData( reinterpret_cast<const QChar*>( data2 ),
        len2 / sizeof( QChar ) );

    return QString::localeAwareCompare( string1, string2 );
}

The use of fromRawData ensures that the string is not copied in memory and makes the function very efficient. In order to register the collation we need to call the following function:

sqlite3_create_collation( db, "LOCALE", SQLITE_UTF16, NULL, &localeCompare );

The first three parameters are: the pointer to the sqlite3 structure which is essentially the handle of the database, the name of the collation and the format of text which is passed to the compare function - we ensure that the text is converted to UTF-16 encoding which is expected by QString. The handle of the database can be retrieved from the QSqlDriver object:

QVariant handle = driver->handle();
sqlite3* db = *static_cast<sqlite3* const*>( handle.data() );

We might think that this is enough for our custom collation to work. However, depending of the configuration of the Qt libraries, we may get an error saying that the sqlite3_create_collation function does not exist.

There are at least three possible ways of how Qt can be integrated with SQLite:

  • Qt may use the shared SQLite library provided by the operating system. This can be achieved by configuring Qt with the -system-sqlite switch and it is often how Qt libraries which are part of Linux distributions are configured. We can simply include the system sqlite3.h header and link to the libsqlite3.so library and we're done.
  • Qt may include it's own copy of SQLite compiled as part of the QtSql library. It is the default configuration for static builds of Qt. When our program is statically linked to QtSql it can access the SQLite functions, but we still need to include the copy of the sqlite3.h file with our program or declare the functions manually.
  • Qt may also include it's own copy of SQLite compiled as part of the qsqlite plugin, which is the usual configuration when Qt is compiled as shared libraries. In that case our program may not be able to access the SQlite functions at all (on Windows, functions are not exported from a DLL unless explicitly declared in code). The only solution in that case is to include our own copy of SQLite in our program.

When writing a cross-platform program which does not rely on Qt being built in any particular configuration, our only solution is to handle the third, most generic case. Fortunately the authors of SQLite make it easy to integrate it with any application by publishing the source code "amalgamated" into one source file and one header and the only disadvantage of such solution is the large size of the SQLite code.

When we include our own copy of SQLite, the program will compile fine, but it still may crash when started. The reason is that the QSQLiteDriver, which is part of Qt, will use its own copy of SQLite and our program will refer to a different copy. Each copy will use a separate set of global variables which are needed for synchronization and our copy will try to access some uninitialized objects resulting in the runtime error.

Unfortunately to make this work we need to not only use our own copy of SQLite but also our own copy of the QSQLiteDriver. We should copy the files containing QSQLiteDriver (which can be found in src/sql/drivers/sqlite folder of Qt sources) and also the sources of the internal QSqlCachedResult class (from src/sql/kernel) which is its base class to our program and change the names of these classes by removing the 'Q' prefix to prevent potential conflicts. Then, instead of loading the default driver, we should instantiate our copy using the following code:

SQLiteDriver* driver = new SQLiteDriver();
QSqlDatabase database = QSqlDatabase::addDatabase( driver );

The whole process of extending SQLite would be much easier if the standard QSQLiteDriver provided a pointer to the sqlite3_api_routines structure which includes pointers to all SQLite functions, regardless of which version it is and how it is linked. I suggested this to Nokia and perhaps it will be implemented in a future version of Qt, but until then we have to use tricks such as the one I described above to make it work.

Note that on systems where SQLite is available as a shared library, we can provide an option to link our program with that library instead of using the internal copy of SQLite. This way our program will be slightly smaller on those systems.