Blog

Fuzzy optimizations in QTransform

Submitted by mimec on 2012-12-06

Before I get to the point, just a quick note: I recently released a new version of both WebIssues and Fraqtive and I'm planning to release a new version of Saladin by the end of this year, so I'm quite busy, as usual at this time of the year. Also check out the fractal animations created using the latest version of Fraqtive. There are far more impressive deep zoom animations available, but they were created using specialized tools which use high precision numbers. Fraqtive uses SSE2 to maximize the real-time experience, so it's limited to double precision, which allows zooming into the fractal by a factor of about 10^13. But this is enough to produce some cool effects.

Now back to the main topic. For a long time there was a strange bug in Fraqtive that I thought was impossible to fix. As those who use it know, in Fraqtive it is possible to move and zoom around the fractal using the mouse, and when you release the mouse button, the new region of the fractal is recalculated. This works by converting the start and end position to a QTransform (which is basically a 3x3 matrix defining the offset, zoom and rotation) and calculating the relative transformation. It worked fine until a certain zoom level was reached; then it started producing weird results. I always blamed the limited precision of double numbers, though the effect could be observed a few orders of magnitude before reaching the maximum valid zoom.

Recently I debugged the entire code, including the calculations performed inside the QTransform class, and discovered that it performs some optimizations which cause the wrong results. The documentation mentions that QTransform performs some optimizations based on the type of the matrix. For example, if the transform includes translation only, the scale and rotation components are ignored. What the documentation doesn't mention, though, is that Qt uses fuzzy compare functions (such as qFuzzyCompare) to determine the type of the matrix.

The problem with qFuzzyCompare and its undocumented qFuzzyIsNull counterpart is that they assume that only about 12 digits are significant in a double precision floating point number. This often makes sense, because limited precision can result in some subtle differences which cause the strict comparison operator to fail. We all know that 10 / 3 is 3.33333... and that value multiplied by 3 gives 9.99999.... In mathematics, this value is equal to 10, but for a computer, these numbers are not necessarily equal.
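A minimal standalone sketch of the two checks may make the thresholds more concrete. It assumes the constants match those used in Qt 4's qglobal.h; fuzzyCompare and fuzzyIsNull are hypothetical stand-ins written for illustration, not the Qt functions themselves.

```cpp
#include <algorithm>
#include <cmath>

// Stand-in for qFuzzyCompare (assumed threshold): the values count as equal
// when their difference is at least 10^12 times smaller than the smaller of
// the two magnitudes.
inline bool fuzzyCompare(double a, double b)
{
    return std::fabs(a - b) * 1e12 <= std::min(std::fabs(a), std::fabs(b));
}

// Stand-in for qFuzzyIsNull (assumed threshold): anything within 10^-12 of
// zero counts as zero.
inline bool fuzzyIsNull(double d)
{
    return std::fabs(d) <= 1e-12;
}
```

With these definitions, fuzzyIsNull( 1e-13 ) is true even though 1e-13 is a perfectly valid, non-zero double.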

The effect of using fuzzy comparisons is that at a zoom level higher than about 10^12, the scale factors are considered equal to 0, so the transformation is considered non-invertible when in fact it is invertible. Rotation also tends to be ignored at this zoom level. The solution is either not to use QTransform when high precision is required, or to write custom functions which do not have these side effects. See also QTBUG-8557, where a similar problem is discussed.
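As a rough illustration of the "custom functions" option, the sketch below inverts an affine 2D transform using a strict comparison with zero. The Affine structure and the invert function are hypothetical names, and this is not the code used in Fraqtive; the member names simply follow the QTransform convention (x' = m11*x + m21*y + dx, y' = m12*x + m22*y + dy).

```cpp
#include <cmath>

// Hypothetical replacement for the part of QTransform that is needed here.
struct Affine
{
    double m11, m12, m21, m22, dx, dy;
};

// Inverts the transform; fails only when the determinant is exactly zero
// or not a number, so very small scale factors are still accepted.
inline bool invert(const Affine& t, Affine& result)
{
    const double det = t.m11 * t.m22 - t.m12 * t.m21;
    if (det == 0.0 || std::isnan(det))
        return false;

    // Inverse of the 2x2 linear part.
    result.m11 = t.m22 / det;
    result.m12 = -t.m12 / det;
    result.m21 = -t.m21 / det;
    result.m22 = t.m11 / det;

    // Inverse translation: minus the inverted linear part applied to (dx, dy).
    result.dx = (t.m21 * t.dy - t.m22 * t.dx) / det;
    result.dy = (t.m12 * t.dx - t.m11 * t.dy) / det;
    return true;
}
```

A transform with scale factors of 1e-13 is inverted without complaint here, because the determinant (1e-26) is small but not zero.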

Note that fuzzy comparisons are used not only in QTransform, but also in other classes like QMatrix4x4 and QVector3D. Some time ago I came across a similar problem with QVector3D::normalized which caused Descend to incorrectly calculate normal vectors for the surfaces. The problem is that Descend calculates three samples that are very close to one another in order to precisely determine the normal vector. In some areas of the surface it would hit the 12 significant digit limit, so I ended up writing my own version of this function which didn't use fuzzy comparisons.
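A normalization routine without fuzzy comparisons could look roughly like the sketch below. It is not the function from Descend; Vec3 and normalized are hypothetical names, and the pre-scaling step is an extra precaution added here so that squaring very small components cannot underflow.

```cpp
#include <cmath>

// Hypothetical plain vector type standing in for QVector3D.
struct Vec3
{
    double x, y, z;
};

// Normalizes v; only a length of exactly zero (or a non-finite component)
// is rejected, so very short vectors are still normalized correctly.
inline Vec3 normalized(const Vec3& v)
{
    // Divide by the largest component first to keep the squares in range.
    const double m = std::fmax(std::fabs(v.x),
                               std::fmax(std::fabs(v.y), std::fabs(v.z)));
    if (m == 0.0 || !std::isfinite(m))
        return Vec3{0.0, 0.0, 0.0};

    const double sx = v.x / m, sy = v.y / m, sz = v.z / m;
    const double len = std::sqrt(sx * sx + sy * sy + sz * sz);
    return Vec3{sx / len, sy / len, sz / len};
}
```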

Qt and ZIP files

Submitted by mimec on 2012-11-20

This is a follow-up to the series of articles about serialization in Qt that I wrote a few months ago. Sometimes it's necessary to create complex files which contain lots of various information. Serializing everything into one stream of data is not always a good idea. Also sometimes it may be a good idea to compress serialized data, because it's usually quite verbose. Instead of coming up with a custom file format, the best idea is to use the solution which is implemented in many applications and document formats, including both OpenOffice and MS Office documents; that is to wrap data in a ZIP file.

There are many advantages of using the ZIP format. We can save data in multiple smaller files which are zipped together. We can include additional files and attachments, for example JPG images. Finally, we can compress selected files to make the resulting file more compact.

There are a few libraries for Qt which handle ZIP files, including OSDaB and QuaZIP. There are also the QZipReader and QZipWriter classes which are part of Qt. They are internal classes that you can find in the src/gui/text subdirectory in Qt sources. There are plans to include them in the official Qt API (see QTBUG-20896), but they will not make it into Qt 5.0 and it's not clear whether they will be included in later versions. However, if the license permits, you can simply copy them into your own application (but see the notes about static linking below).

A problem for Windows developers is that all these libraries and classes depend on zlib.h. This is fine on Linux; however, zlib is (unfortunately) not part of the Windows API. The good news is that QtCore not only includes zlib statically when built on Windows, but it also marks its functions as DLL exports. Thanks to this, the QZipReader and QZipWriter classes, which are part of QtGui, can still work on Windows. In my applications, including Descend which uses the ZIP file format for its projects, I simply include the zlib.h and zconf.h files from Qt in the source package and use them when the system zlib library is not available. Then I use the following simple .pri file to include either system zlib headers or the custom ones:

contains( QT_CONFIG, system-zlib ) {
  if( unix|win32-g++* ): LIBS += -lz
  else: LIBS += zdll.lib
} else {
  INCLUDEPATH += $$PWD
}

Thanks to this, #include "zlib.h" works no matter if zlib is a system library or not. When Qt is built without system-zlib (which is usually the case on Windows), it will include all the necessary exports in QtCore, so the application will link and work correctly.

Including internal Qt classes as part of the application is potentially dangerous, because there can be conflicts between the classes provided by Qt and by the application itself, especially when the application is linked with a different version of Qt. This is even more dangerous when the application is statically linked with the Qt libraries, which I usually do in Windows release builds to prevent DLL dependency problems.

To avoid problems, I always remove the 'Q' prefix from such classes. This way they are seen as separate entities from the ones provided by Qt. However, I encountered a strange problem with Descend. In dynamically linked debug builds it worked fine, but when compiled in static release mode, it crashed when copying text to the clipboard. At first I suspected a strange bug in Qt or the compiler itself, and the problem was hard to debug because it only happened in release builds. However, after some analysis, I discovered that QPlainTextEdit copies text into the clipboard not only as plain text, but also in HTML and ODT formats. Coincidentally, creating an ODT document is exactly what Qt uses the internal QZipWriter class for...

It finally turned out that there was an innocent looking structure in qzip.cpp called FileHeader, which I slightly modified, but I didn't rename it. A plain structure would be fine, but this one had an implicit constructor and destructor, because it contained a QByteArray member. Unfortunately the linker doesn't detect such conflicts (it would not be possible anyway, because of how C++ works), but happily overrides the symbols from the Qt library with identically named symbols defined in my application.

This can lead to unexpected problems not only when copying code from Qt, but also in many other situations. After all, many libraries can include a class named FileHeader. That's why static linking must be used carefully. Qt generally uses the Q prefix everywhere, even in private classes and functions, to prevent this type of problem, but this particular one didn't have it. If you ever experience mysterious crashes in release builds, this can be one of the possible reasons.
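Besides renaming, one general C++ way to guard copied code against this kind of clash is to put it into a namespace. The sketch below is only an illustration: descend_zip is a hypothetical namespace name, and std::string stands in for QByteArray so that the example compiles without Qt.

```cpp
#include <string>

// Copied code goes into a project-specific namespace, so the implicit
// constructor and destructor of FileHeader get symbols that differ from
// those of an identically named structure in a statically linked library.
namespace descend_zip
{

struct FileHeader
{
    std::string fileName;      // non-trivial member, like QByteArray
    unsigned int crc32 = 0;
};

} // namespace descend_zip

// For helpers used in a single .cpp file, an unnamed namespace goes one step
// further: everything declared inside it has internal linkage.
namespace
{

struct FileHeader
{
    std::string comment;
};

} // unnamed namespace
```

Both structures can coexist in one program, because the linker sees them as different entities.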

Descend 0.2 released

Submitted by mimec on 2012-10-15

The first official version of Descend was released today. You can grab it from the Descend website which was also officially launched today. It's quite a historical moment, as Descend has been waiting for about eight years to be released. You can read a short history of this program and its predecessors here. Also this is the fourth sub-website in the mimec.org family, so I decided to add some nice links on the front page.

What's coming next? As I said recently, I have plans to start working on version 1.1 of WebIssues, but before I release the first beta version, you can also expect new releases of both Saladin and... Fraqtive. Possibly by the end of this year :).

Crossroads

Submitted by mimec on 2012-10-08

It's been a while since I last wrote a post, but I've been quite busy. First of all, in a week or two I will release the first, official beta version of Descend. It will be quite an event, because I made several attempts at writing it in the last eight years and I've never come that close to finishing it. In case you missed the earlier post about Descend, it's a program for drawing 3D surfaces (and curves) based on parametric equations. It's sort of a conceptual project, so it's not going to have a lot of features, but it aims at being very fast and producing high quality graphics.

Another thing that has been on my mind in the past few months is the future of WebIssues. I came to the point where it simply doesn't make a lot of sense to put more effort into it without actually getting something back. I did a lot of analysis on how to make money from this project, especially focusing on controversies around the open core model. I came to the conclusion that the best solution is to create a specialized, commercial system based largely on WebIssues and dedicated to a narrow group of users, while keeping WebIssues itself a free, powerful, general purpose tool for bug tracking, project management, etc., which it already is, and letting it continue to evolve.

The idea is not new; I've been thinking about profiting from WebIssues since I started working on version 1.0 three years ago, but now I have a much clearer vision of what I'm trying to achieve. Most of all, I don't want to be an outsourced developer for the rest of my life. I got as far as I could in this area in terms of both pay and career development. I need to change something sooner or later and this could be a great opportunity to do that. Of course, I could just get a highly paid job and forget about all these open source shenanigans; but in the end, what you achieve in life is a matter of much more than just money. So as always I'm choosing the harder way, and it's going to take time, but I think it's going to be worth the effort.

At the moment, however, I'm already planning to start working on version 1.1. The updated roadmap includes a long description field, roles and groups, LDAP authentication and project summary. I have an initial design for some of these features; others may be added as well, as the roadmap is not closed yet. I will be posting more information on webissues.mimec.org and might even release the first beta version by the end of the year, but I'm not imposing any deadlines on myself. As long as I do it for free, it's nothing more than a hobby, and as such it competes for my time with other hobby projects, including Descend, Saladin, and especially the new update for Minecraft: Xbox 360 Edition that's coming out soon :).

Serialization in Qt - part 4

Submitted by mimec on 2012-08-22

In the last few posts I wrote about serializing data in an extensible and effective binary format using QDataStream. So far it was focused on value types and simple structures that can be easily converted to QVariant and QVariantMap. In the last post I mentioned that creating objects dynamically based on class name requires implementing some kind of an object factory. Now let's analyse what is needed to serialize an entire hierarchy of abstract objects that refer to one another. Note that this is a very complex topic and there is no single, universal solution, so I won't provide the full code. Instead I will discuss what is necessary to craft such a solution depending on the exact requirements.

Let's assume that we're serializing a project which consists of shapes of various types - circles, squares, etc. There are also some complex shapes, like groups or layers, which consist of other shapes. The first difficulty is that shape is an abstract type, so we need to store the actual class name along with the object data in order to be able to re-create the object upon deserialization. This was more or less covered in the last post.

Another difficulty is that objects refer to one another by pointers, forming a graph of relations, in which one object may be accessed from many other objects. We need to ensure that the object is only serialized and deserialized once, and all other references must be correctly maintained. There even can be cyclic dependencies; for example a parent object can have a pointer to a child, and the child can have a pointer to the parent.

For the sake of simplicity I will assume that each serializable class inherits QObject; that's not really necessary, but having a single common base class makes things easier. The class should also be registered in the object factory discussed before. Finally, it should implement the following interface, which provides methods for serializing and deserializing the object:

class Serializable
{
public:
    // virtual destructor, as usual for an interface with virtual methods
    virtual ~Serializable() {}

    virtual void serialize( QVariantMap& data, SerializationContext* context ) const = 0;
    virtual void deserialize( const QVariantMap& data, SerializationContext* context ) = 0;
};

The data is stored in a QVariantMap for reasons that were also discussed in one of the previous articles, so that the file format is extensible and backward compatible. The context object is responsible for performing the serialization and deserialization. We will get to it in a moment.

Note that the Serializable class could be an abstract class which inherits QObject. All concrete classes could then inherit it and implement the serialization methods. However, in this case it wouldn't be possible to add serialization capabilities to existing subclasses of QObject, for example widgets. Using a separate interface gives us more flexibility. Although multiple inheritance in C++ is a very complex subject, it's very common in most object oriented languages for a class to inherit behavior and implementation from a single base class, and implement a number of additional interfaces.

An incomplete example of a serializable class containing a pointer might look like this:

class Shape : public QObject, public Serializable
{
    Q_OBJECT
public:
    void serialize( QVariantMap& data, SerializationContext* context ) const
    {
        data[ "Name" ] << m_name;
        data[ "Other" ] = context->serialize( m_other );
    }

    void deserialize( const QVariantMap& data, SerializationContext* context )
    {
        data[ "Name" ] >> m_name;
        m_other = context->deserialize<Shape>( data[ "Other" ] );
    }

private:
    QString m_name;
    Shape* m_other;
};

The serialize method of the SerializationContext first checks if the given object was already serialized. If not, it appends it to the internal list of objects and calls the serialize method on this object to store its data in a QVariantMap. Then it returns a handle to the object, which is a QVariant. Internally it contains an integer value identifying the object in the given context.

The deserialize method checks if the object with the given handle was already deserialized. If not, it creates a new instance of the appropriate class using the object factory and calls the deserialize method. Note that the object is not necessarily a Shape; it might actually be a subclass of it like Square or Circle.
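The object factory itself was covered in the previous post, so the following is only a reminder of the idea, written as a standalone sketch. Object stands in for QObject so that the code compiles without Qt, and registerClass, as well as the non-static createObject, are hypothetical; the real factory may look quite different.

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical stand-in for QObject.
struct Object
{
    virtual ~Object() {}
    virtual std::string className() const = 0;
};

struct Circle : Object
{
    std::string className() const override { return "Circle"; }
};

struct Square : Object
{
    std::string className() const override { return "Square"; }
};

// Maps class names to functions that create a new instance of that class.
class ObjectFactory
{
public:
    template<typename T>
    void registerClass(const std::string& name)
    {
        m_creators[name] = []() -> Object* { return new T(); };
    }

    // Returns a new object owned by the caller, or nullptr if the class
    // name was never registered.
    Object* createObject(const std::string& name) const
    {
        auto it = m_creators.find(name);
        return it != m_creators.end() ? it->second() : nullptr;
    }

private:
    std::map<std::string, std::function<Object*()>> m_creators;
};
```

The caller only knows the class name stored in the record; the factory decides which concrete class is instantiated.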

Note that the context doesn't actually read or write any data from/to a stream. Instead, it stores a list of records, which include the pointer to the object, its class name and serialized data. So the handle is simply the position of the object in the list. The entire context can be written into the stream once all objects are serialized. Conversely, when deserializing, the context is first read from the stream, and then individual objects are deserialized.
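To illustrate how a list of records and integer handles is enough to rebuild a graph of pointers, here is a simplified, standalone sketch of the reading side. Node, Record and Loader are hypothetical names, handles are plain integers with -1 standing for a null pointer, and all objects have the same type, so no factory or casting is involved.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node type; "other" plays the role of a pointer member.
struct Node
{
    std::string name;
    Node* other = nullptr;
};

// One stored record: the node's data and the handle of the node it refers to.
struct Record
{
    std::string name;
    int otherHandle;
};

class Loader
{
public:
    explicit Loader(std::vector<Record> records)
        : m_records(std::move(records)), m_objects(m_records.size())
    {
    }

    // Returns the node for the given handle, creating it on first use.
    Node* deserialize(int handle)
    {
        if (handle < 0 || handle >= static_cast<int>(m_records.size()))
            return nullptr;

        if (m_objects[handle])
            return m_objects[handle].get();

        // Register the new node before resolving its references, so that a
        // cycle leading back to this handle finds the existing node.
        m_objects[handle].reset(new Node);
        Node* node = m_objects[handle].get();
        node->name = m_records[handle].name;
        node->other = deserialize(m_records[handle].otherHandle);
        return node;
    }

private:
    std::vector<Record> m_records;
    std::vector<std::unique_ptr<Node>> m_objects;   // owns the created nodes
};
```

In this sketch the loader owns the nodes; in a real application the ownership would more likely be handed over to the object hierarchy itself.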

We may serialize as many objects as we need using the same context, but we need to store the handles in the stream along with the context data, because we will need them when deserializing. Alternatively, we may serialize an entire hierarchy of objects by serializing the "root" object and ensuring that all children are serialized recursively:

QDataStream stream;
SerializationContext context;

context.serialize( root );

stream << context;

In that case we don't need to store the handle, because we know that the handle of the first serialized object is always integer zero (not to be confused with invalid variant, which represents a NULL pointer):

QDataStream stream;
SerializationContext context;

stream >> context;

Shape* root = context.deserialize<Shape>( QVariant::fromValue<int>( 0 ) );

So what does the SerializationContext class look like? This is an incomplete definition:

class SerializationContext
{
public:
    template<typename T>
    QVariant serialize( T* ptr );

    template<typename T>
    T* deserialize( const QVariant& handle );

    friend QDataStream& operator <<( QDataStream& stream, const SerializationContext& context );
    friend QDataStream& operator >>( QDataStream& stream, SerializationContext& context );

private:
    struct Record
    {
        QObject* m_object;
        QByteArray m_type;
        QVariantMap m_data;
    };

private:
    QList<Record> m_records;
    QHash<QObject*, int> m_map;
};

The serialize and deserialize methods are discussed below. The shift operators make it possible to read and write the entire context from/to the stream. The list of records stores information about objects, including their type and data. The map is optional; it simply makes lookup slightly faster for a large number of objects.

You may notice that both the serialize and deserialize methods are templates. Why not simply cast everything to void*? Also, why do the record and the map store a QObject* instead of a void*?

This is because of how multiple inheritance works in C++. Let's assume that you have a pointer to a Shape object. When you cast it to QObject*, and to Serializable*, you will receive two different pointers, that may be different from the original one. That's because in memory, the Shape object consists of a QObject, followed by Serializable, so an offset must be added or subtracted to convert one pointer to another.

You can safely cast pointers up and down the hierarchy of classes using the static_cast operator, and the compiler will ensure behind the scenes that the pointers are adjusted accordingly. But when you cast something to void*, you lose all the information, so casting it back to some other pointer may produce wrong results!
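The pointer adjustment can be observed directly with a small standalone sketch. First, Second and Both are hypothetical stand-ins for QObject, Serializable and Shape; the exact layout is up to the compiler, but mainstream compilers place the second base class after the first one.

```cpp
struct First
{
    virtual ~First() {}
    int a = 1;
};

struct Second
{
    virtual ~Second() {}
    int b = 2;
};

// Same shape as "class Shape : public QObject, public Serializable".
struct Both : First, Second
{
};

// Returns true when converting to Second* changed the address.
inline bool secondBaseIsShifted(Both* both)
{
    Second* second = static_cast<Second*>(both);
    return static_cast<void*>(second) != static_cast<void*>(both);
}

// Round trip through the class hierarchy: the compiler undoes the adjustment.
inline Both* backToBoth(Second* second)
{
    return static_cast<Both*>(second);
}
```

Going from Both* to void* and then straight to Second* would skip the adjustment, and the resulting pointer would refer to the wrong part of the object.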

Let's take a look at the serialize method:

template<typename T>
QVariant SerializationContext::serialize( T* ptr )
{
    if ( ptr == NULL )
        return QVariant();

    QObject* object = static_cast<QObject*>( ptr );

    QHash<QObject*, int>::iterator it = m_map.find( object );

    if ( it != m_map.end() )
        return QVariant( it.value() );

    int index = m_records.count();

    Record record;
    record.m_object = object;
    record.m_type = object->metaObject()->className();
    m_records.append( record );

    m_map.insert( object, index );

    Serializable* serializable = static_cast<Serializable*>( ptr );

    QVariantMap data;
    serializable->serialize( data, this );

    m_records[ index ].m_data = data;

    return QVariant( index );
}

Notice how the pointer is explicitly cast to QObject*, and later to Serializable*? It's not possible to static_cast a QObject* directly to Serializable*, because they are unrelated classes, and forcing the conversion by using reinterpret_cast or by going through void* would skip the pointer adjustment and most likely crash the application. This is even more apparent in the deserialize method:

template<typename T>
T* SerializationContext::deserialize( const QVariant& handle )
{
    if ( !handle.isValid() )
        return NULL;

    int index = handle.toInt();

    Record& record = m_records[ index ];

    if ( record.m_object != NULL )
        return static_cast<T*>( record.m_object );

    QObject* object = ObjectFactory::createObject( record.m_type );

    record.m_object = object;

    m_map.insert( object, index );

    T* ptr = static_cast<T*>( object );
    Serializable* serializable = static_cast<Serializable*>( ptr );

    serializable->deserialize( record.m_data, this );

    return ptr;
}

Here the QObject* is first cast to the actual type, T*, and then from T* to its other base class, Serializable*. It doesn't matter if T is a Shape and the object is actually a Square or Circle, because all subclasses of Shape have the same layout of base classes.

Other than that, the code is quite straightforward, though there is another gotcha when dealing with circular references between objects. The record must be appended to the list before the serialize method is called on the object. This way, if the same pointer is encountered while serializing the object, it is not serialized again, which would lead to infinite recursion.
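The ordering requirement is easier to see in a stripped-down, standalone sketch of the writing side. Node, Record and Saver are hypothetical names, handles are plain integers with -1 standing for a null pointer, and the data is reduced to a name and one reference.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

struct Node
{
    std::string name;
    Node* other = nullptr;
};

struct Record
{
    std::string name;
    int otherHandle = -1;
};

class Saver
{
public:
    // Returns the handle of the node, serializing it on first use.
    int serialize(const Node* node)
    {
        if (node == nullptr)
            return -1;

        auto it = m_map.find(node);
        if (it != m_map.end())
            return it->second;

        // Reserve the slot and remember the handle *before* following any
        // pointers; otherwise a cycle would recurse forever.
        const int index = static_cast<int>(m_records.size());
        m_records.push_back(Record());
        m_map[node] = index;

        Record record;
        record.name = node->name;
        record.otherHandle = serialize(node->other);

        // Access the slot by index: the vector may have grown meanwhile.
        m_records[index] = record;
        return index;
    }

    const std::vector<Record>& records() const { return m_records; }

private:
    std::vector<Record> m_records;
    std::unordered_map<const Node*, int> m_map;
};
```

With a parent and a child pointing at each other, serializing the parent produces exactly two records, and each record stores the handle of the other one.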
