Sunday, March 18, 2007 5:33 PM Erwyn van der Meer

Diving into C++ for Flickr Metadata Synchr v0.6.0.0

The situation

I am currently working on version 0.6.0.0 of my Flickr Metadata Synchr tool. The goals for my open source project on CodePlex are described on the Flickr Metadata Synchr wiki page and you can always find the latest status there.

At the moment the latest public release is version 0.5.5.0. The feature set for v0.5.5.0 is roughly:

  • Allow you to select a Flickr photoset and a local directory with images.
  • Load metadata from both local and Flickr images into internal metadata structures.
  • Compare these metadata structures and synchronize them.
  • Update metadata on Flickr after the synchronization.

One of the features planned for v0.6.0.0 is updating the XMP and IPTC metadata in locally stored images. I was planning on doing this through the Windows Imaging Component (WIC), which is part of the .NET Framework 3.0. WIC is also available as a separate download for Windows XP and Windows Server 2003.

Windows Presentation Foundation provides a nice managed API for reading and writing metadata through WIC. It provides the SetQuery and GetQuery methods on the BitmapMetadata class. I was already using the GetQuery method, which works fine. However, I hit a snag when I wanted to use the SetQuery method to update metadata.

Plan A

There is a way to do this through the InPlaceBitmapMetadataWriter class. It touches only the metadata structures in the image file and doesn't have to read or write the entire stream with pixel information. This gives you excellent performance, and you don't run the risk of having to reencode the pixel stream or losing metadata. The sad thing is that it almost never works. The image file often does not have enough room in its metadata structures to allow metadata fields to be filled or updated. When you try to save the updated metadata, the InPlaceBitmapMetadataWriter fails. That is probably why the save method is called TrySave. By the way, the code sample on that MSDN page is dead wrong. If you call TrySave before updating metadata, it always succeeds, probably because there is nothing to save yet. You have to call it after you update the metadata, and then it returns false ;( Which means your metadata was not updated successfully.
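To make the failure mode concrete, here is a toy model in portable C++ of why an in-place update can fail. It is only a sketch: MetadataBlock and try_update_in_place are hypothetical names, and the fixed byte budget is an assumption standing in for whatever room the real file format leaves. It is not how WIC lays out metadata internally.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy model (NOT WIC's real layout): a metadata block that owns a fixed
// number of bytes inside the file. An in-place update may only use that room.
struct MetadataBlock {
    std::size_t capacity;  // bytes reserved in the file for this field
    std::string value;     // current field contents
};

// Mirrors the spirit of TrySave: reports failure through the return value
// and leaves the block untouched when the new value does not fit.
bool try_update_in_place(MetadataBlock& block, const std::string& new_value) {
    if (new_value.size() > block.capacity) {
        return false;  // no room: the caller has to rewrite the whole file
    }
    block.value = new_value;
    return true;
}
```

In this model a short caption fits and a longer one is rejected, which matches the behaviour described above: the result only means something when you check it after changing the metadata.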

Plan B

So I tried plan B: creating a new image file by writing out a copy of the original image, but now with updated metadata. This means you have to grab the original BitmapFrame from the JpegBitmapDecoder, clone it, update its metadata and write it out again using the JpegBitmapEncoder.

This is where I hit a major problem. The Save() method on the JpegBitmapEncoder almost always fails with an InvalidOperationException with the error message "Cannot write to the stream". When the encoder is able to write out the image, the JPEG turns out to be reencoded with a different quality than the original. This is noticeable through a significant change in size of the file. This happens even though I specified the BitmapCreateOptions.PreservePixelFormat option when opening the image with the decoder. Googling (or Windows Live Searching if you will) for a solution didn't yield anything useful.

Plan C

I had to come up with a Plan C. The Windows Vista Shell is obviously able to update metadata in images without affecting the JPEG quality and without creating a copy of the image file. This led me to an MSDN article titled "Photo Metadata Policy". This is the introduction:

Metadata (file properties) for photo files can be stored using multiple metadata schemas, in different data formats and in different locations within a file. In Windows Vista™, the Microsoft® Windows® Shell provides a built-in property handler for photo files formats, such as JPEG, TIFF, and PNG, to simplify metadata retrieval.

When a piece of metadata is present in different underlying schemas, the built-in property handler determines which value to return. For instance, the Author property may be stored in the following locations in a TIFF file:

  • The Creator tag in the XMP Dublin Core schema:
    /ifd/xmp/purl.org/dc/elements/1.1/dc:creator
    
  • The Artist tag in the EXIF schema:
    /ifd/{ushort=315} 
    
  • The Artist tag in the EXIF schema embedded in an XMP block:
    /ifd/xmp/ns.adobe.com/tiff/1.0/tiff:artist
    

On read, the property handler determines the value that takes precedence over the others that exist in the file and returns it. On write, the property handler makes sure it leaves each schema in a resolved and consistent state with the others. This may mean either updating or removing the tag in question in a given schema.

This would also help to solve another piece of the metadata puzzle: what to do with the several different options of putting metadata in image files (XMP versus IPTC, multiple possible XMP places, etc.). After updating an image, I want the metadata in the different blocks to be consistent. WIC doesn't help with this. You have to sort it out yourself. The Windows Vista Shell does help with metadata reconciliation.
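The read and write rules quoted above can be sketched in a few lines of portable C++. This is a simplified illustration, not the Shell's implementation: MetadataStore, read_author and write_author are hypothetical names, the precedence order in kAuthorPaths is an assumption (the real order is defined by the Photo Metadata Policy), and on write this sketch only updates existing copies, whereas the real handler may also remove tags.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Metadata query path -> stored value, for a single image file.
typedef std::map<std::string, std::string> MetadataStore;

// ASSUMPTION: this precedence order is illustrative only.
static const std::vector<std::string> kAuthorPaths = {
    "/ifd/xmp/purl.org/dc/elements/1.1/dc:creator",
    "/ifd/xmp/ns.adobe.com/tiff/1.0/tiff:artist",
    "/ifd/{ushort=315}",
};

// Read: take the value from the first path, in precedence order, that exists.
bool read_author(const MetadataStore& store, std::string& out) {
    for (std::size_t i = 0; i < kAuthorPaths.size(); ++i) {
        MetadataStore::const_iterator it = store.find(kAuthorPaths[i]);
        if (it != store.end()) {
            out = it->second;
            return true;
        }
    }
    return false;
}

// Write: always set the highest-precedence location, and update every other
// location that already carries the tag so that no stale copy survives.
void write_author(MetadataStore& store, const std::string& author) {
    store[kAuthorPaths.front()] = author;
    for (std::size_t i = 1; i < kAuthorPaths.size(); ++i) {
        MetadataStore::iterator it = store.find(kAuthorPaths[i]);
        if (it != store.end()) {
            it->second = author;
        }
    }
}
```

With plain WIC you would have to write and maintain this kind of table yourself for every property, which is exactly the work the Shell's property handler takes off your hands.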

So all seems to be well. Just use the Shell API to update the metadata. I would love to be able to do this from C#. Yet that doesn't seem to be possible, or it is extraordinarily difficult. The "file property" handling is implemented in propsys.dll through a COM-based API. But you can't add a reference to this COM library in a C# project. It doesn't have a type library ;( The only option I can find is to use C++ with the propsys.h and propsys.idl files that are distributed in the Windows SDK. This is horrible. I guess I have to dust off my C++ skills to be able to call a brand-new Windows Vista API. WTF?!

The "Longhorn" promise for managed code

Do you remember the promises Microsoft made back in 2003 for the new Windows Client OS codenamed "Longhorn"? I sure do, since I visited the PDC03 conference where this was all announced. Microsoft promised us a brave new world where all Windows APIs could be accessed easily from managed code. Three and a bit years later we have a new Windows Client OS called Vista that doesn't live up to this promise. Microsoft has implemented new APIs that seem to be inaccessible from managed code other than through C++.

Now I can understand why part of the promise was lost during the infamous "Longhorn Reset" at Microsoft. Microsoft's ambition to completely wrap all existing Win32 APIs in WinFX was too big. But why Microsoft would be creating new APIs without managed code in mind is beyond me...

I found some C++ code on the blog of Ben Karas that is indeed able to update metadata. I hate having to add this C++ code to my project. It would require people to have Visual C++ and the Windows SDK (especially the Windows Vista header and library (*.h, *.idl, *.lib) files) installed to be able to build my code in Visual Studio.

A plan D might be to manually create C# wrappers for the COM interfaces of propsys.dll. This article describes how to do this for COM interfaces in general.


# re: Diving into C++ for Flickr Metadata Synchr v0.6.0.0

Monday, March 19, 2007 3:33 PM by Alois Kraus

Hi Erwyn,

Did you check out the tlbimp tool in the .NET Framework SDK? It can automatically create a COM wrapper DLL from a COM DLL. The generated wrapper normally works, but from time to time it needs some fine-tuning with ildasm/ilasm.

Yours,

  Alois Kraus

# re: Diving into C++ for Flickr Metadata Synchr v0.6.0.0

Tuesday, March 20, 2007 8:11 AM by Erwyn van der Meer

Hi Alois. The propsys.dll has no type library (only an .idl file in the Windows SDK). So I can't use the tlbimp tool for this DLL.

# re: Diving into C++ for Flickr Metadata Synchr v0.6.0.0

Tuesday, March 20, 2007 12:40 PM by Alois Kraus

Hi Erwyn,

In that case you could create a type library with the MIDL compiler and then try to import it with tlbimp.exe.

Yours,

  Alois Kraus

# re: Diving into C++ for Flickr Metadata Synchr v0.6.0.0

Friday, March 23, 2007 2:40 PM by Erwyn van der Meer

Hi Alois. Thanks for the suggestion.

I have created a type library using the Windows SDK v6.0 versions of midl.exe and tlbimp.exe. However, tlbimp.exe gives a massive number of warnings and the generated IL contains lots of ComConversionLossAttributes. One of the most important structs used by propsys.dll, PROPVARIANT (http://msdn2.microsoft.com/en-us/library/aa380072.aspx), is completely unusable as the ".NET type" tag_inner_PROPVARIANT in the interop assembly. The union inside the PROPVARIANT struct is completely lost. All that remains of it in tag_inner_PROPVARIANT is a member of type __MIDL___MIDL_itf_propsys_0002_0082_0001. This is an empty struct without members, with a SizeAttribute and PackAttribute with value 8.
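To show what gets lost, here is a simplified stand-in for the shape of PROPVARIANT in portable C++. It is a sketch under assumptions: MiniPropVariant and try_get_int32 are hypothetical names, and only three of the many members of the real union are shown. The point is that every union member overlaps at the same offset and only the vt tag says which one is valid, which an importer cannot express without explicit field offsets.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Type tags with the same numeric values as VT_I4, VT_R8 and VT_LPWSTR.
const std::uint16_t kVtI4 = 3;
const std::uint16_t kVtR8 = 5;
const std::uint16_t kVtLpwstr = 31;

// Simplified stand-in for PROPVARIANT (hypothetical, heavily trimmed).
// The header is a 2-byte tag plus three reserved 16-bit words, so the
// union starts at byte offset 8.
struct MiniPropVariant {
    std::uint16_t vt;  // discriminant: which union member is valid
    std::uint16_t reserved1;
    std::uint16_t reserved2;
    std::uint16_t reserved3;
    union {
        std::int32_t lVal;       // valid when vt == kVtI4
        double dblVal;           // valid when vt == kVtR8
        const wchar_t* pwszVal;  // valid when vt == kVtLpwstr
    };
};

// All union members share one offset; this is what the interop struct lost.
static_assert(offsetof(MiniPropVariant, lVal) == 8, "union starts at 8");
static_assert(offsetof(MiniPropVariant, dblVal) == 8, "members overlap");

// Read the integer member only if the tag says it is the active one.
bool try_get_int32(const MiniPropVariant& pv, std::int32_t& out) {
    if (pv.vt != kVtI4) {
        return false;
    }
    out = pv.lVal;
    return true;
}
```

A hand-written C# equivalent would typically use an explicit struct layout with every union member placed at the same field offset, which is the kind of interop structure that has to be recreated manually.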

The blog post http://blogs.msdn.com/dimeby8/archive/2006/12/11/wpd-property-retrieval-in-c.aspx might help me with recreating some of the necessary interop structures.

But all in all, the managed C++ route seems to be easier.

# re: Diving into C++ for Flickr Metadata Synchr v0.6.0.0

Tuesday, April 03, 2007 5:03 AM by Leon

Correction, you may have allocated yourself, depending of course on the API you're using :)

# When will the next PDC be?

Friday, June 01, 2007 10:33 AM by Erwyn van der Meer

PDC07 has been postponed indefinitely. Dennis already warned me just before MIX07 that Microsoft wouldn't

# Released FlickrMetadataSynchr v0.8.0.0 which is fully functional

Sunday, August 19, 2007 2:23 PM by Erwyn van der Meer

After a long day and night of coding, I released version 0.8.0.0 of my Flickr Metadata Synchr tool on