Comparison by overlay of 2D drawings in PDF, image or CAD

Overlay is the most common comparison method in software that compares PDF files containing drawings. It is often the only method these programs use to find similarities and differences.

The principle depends on the file type. Image files (TIF, TIFF, PNG, BMP, JPEG, etc.) are handled differently from PDF files (PDF, PS) and CAD files (EMF, WMF, SVG, STEP, IGES, HPGL, etc.): for image files the comparison works on pixels, while for PDF and CAD files it works on graphic elements.

How to compare image files by overlay

The bitmap or raster image is a digital image composed of a series of pixels. Each pixel has its color, and when placed side by side, they form an image. The pixels are arranged in an X, Y matrix representing the height and width of the image.

Overlay comparison consists of comparing, for each X and Y position, the pixel of one image with the pixel at the same position in the other image.
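The principle can be sketched in a few lines of Python. This is only an illustration, not the code of any particular product: the function name `overlay_diff` and the representation of an image as a list of rows are assumptions made for the example.

```python
def overlay_diff(image_a, image_b):
    """Return the (x, y) positions where two same-sized images differ."""
    same_shape = len(image_a) == len(image_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(image_a, image_b)
    )
    if not same_shape:
        raise ValueError("images must have the same dimensions")
    return [
        (x, y)
        for y, (row_a, row_b) in enumerate(zip(image_a, image_b))
        for x, (pixel_a, pixel_b) in enumerate(zip(row_a, row_b))
        if pixel_a != pixel_b  # exact test, no tolerance
    ]


# Two 3x3 images; one pixel was added in the second version.
old = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
new = [[0, 0, 0], [0, 1, 1], [0, 0, 0]]
print(overlay_diff(old, new))  # [(2, 1)]
```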

In principle, the overlay comparison seems quite simple but several important elements come into play.

Pixel color

Overlay comparison software tests pixel color exactly: the pixels at the same position in the two files must be exactly the same to be considered similar; otherwise they are considered different. This can cause many problems with JPEG images, as we will see later.

In the “COMPARE” software, a pre-processing step converts the color image into a black and white image. In industrial design, color is not used: before CAD and computer-aided drafting software, plans were drawn in pencil or with a Rotring pen and color was very rarely used, and this is still the case.

First, we convert the image into a gray-level image, using a classic color-to-gray conversion algorithm. Then we apply a threshold to turn the gray levels into a black and white image.
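As a rough sketch of these two steps, the Python below converts RGB pixels to gray and then applies a threshold. The luma weights (0.299, 0.587, 0.114) are one common choice and are an assumption here, as are the function names and the default threshold of 128; the text does not say which algorithm or default value “COMPARE” uses.

```python
def to_gray(rgb):
    """Weighted sum of R, G and B (assumed weights, one classic choice)."""
    r, g, b = rgb
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def to_black_and_white(image, threshold=128):
    """Map every RGB pixel to 0 (black) or 1 (white)."""
    return [
        [1 if to_gray(pixel) >= threshold else 0 for pixel in row]
        for row in image
    ]


image = [
    [(255, 255, 255), (200, 30, 30)],   # white, dark red
    [(10, 10, 10), (250, 250, 240)],    # near black, off-white
]
print(to_black_and_white(image))                # [[1, 0], [0, 1]]
print(to_black_and_white(image, threshold=50))  # [[1, 1], [0, 1]]
```

Lowering the threshold in the second call turns the dark red pixel white, which shows why an adjustable threshold helps adapt the conversion to each type of document.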

In the software options, the “COMPARE module” section allows you to modify the value of this threshold. This way, you adapt it to each type of document processed.

This method has two advantages:

  1. Color is not a differentiating element in designs
  2. The undesirable effects of JPEG coding on essentially black and white images are avoided.

The JPEG (Joint Photographic Experts Group) format, widely used today, is a pixel-based image file format that uses lossy compression. It reduces file size, but it introduces artifacts (aliasing on the edges, slight blur or noise) that degrade image quality. To avoid these artifacts and the posterization effect, we convert the image to gray levels and apply the thresholding described above.
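The effect can be illustrated with a small Python sketch. The gray values below are made up to imitate compression noise; they are not the output of a real JPEG encoder, and the `binarize` helper and its threshold of 128 are assumptions for the example.

```python
def binarize(gray_row, threshold=128):
    """Turn gray levels (0-255) into 0 (black) or 1 (white)."""
    return [1 if value >= threshold else 0 for value in gray_row]


clean = [255, 255, 0, 0, 255]       # crisp black line on white
jpeg_like = [252, 247, 9, 4, 250]   # same line with illustrative noise

# Exact comparison: every pixel is reported as different.
exact_diffs = sum(a != b for a, b in zip(clean, jpeg_like))

# Comparison after thresholding: the noise disappears.
bw_diffs = sum(a != b for a, b in zip(binarize(clean), binarize(jpeg_like)))

print(exact_diffs, bw_diffs)  # 5 0
```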

Thanks to this technique, used by the “COMPARE” tool, the comparison by overlay of 2D drawings is much better than the comparison which is based solely on color.

Tolerance

Tolerance is the second important point when comparing images by overlay. Between the two versions of an image, the software that generated them may have evolved, so the calculations, or their precision, are no longer exactly the same. A point that was displayed at coordinates X, Y may now be found at X + Delta, which distorts the reading of the similarities and differences between the two image files.

In the “COMPARE” tool, you can adjust the tolerance used to compare the two images each time you run a comparison. The default tolerance is 1 pixel, and you can specify any value between 0 and 99 pixels.
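One way to picture a pixel tolerance is sketched below in Python. It assumes black-and-white images of equal size (0 = black, 1 = white) and treats a black pixel as matched when the other image has a black pixel within a square neighbourhood of `tolerance` pixels. The neighbourhood shape and the function name are assumptions; the text does not describe how “COMPARE” applies its tolerance internally.

```python
def diff_with_tolerance(image_a, image_b, tolerance=1):
    """Black pixels of image_a with no black pixel of image_b nearby."""
    height, width = len(image_b), len(image_b[0])

    def has_black_nearby(x, y):
        # Scan the square window around (x, y), clipped to the image.
        for ny in range(max(0, y - tolerance), min(height, y + tolerance + 1)):
            for nx in range(max(0, x - tolerance), min(width, x + tolerance + 1)):
                if image_b[ny][nx] == 0:
                    return True
        return False

    return [
        (x, y)
        for y, row in enumerate(image_a)
        for x, pixel in enumerate(row)
        if pixel == 0 and not has_black_nearby(x, y)
    ]


# The same vertical line, shifted by one pixel in the second image.
a = [[1, 0, 1, 1], [1, 0, 1, 1], [1, 0, 1, 1]]
b = [[1, 1, 0, 1], [1, 1, 0, 1], [1, 1, 0, 1]]
print(diff_with_tolerance(a, b, tolerance=0))  # [(1, 0), (1, 1), (1, 2)]
print(diff_with_tolerance(a, b, tolerance=1))  # []
```

The check is one-directional; running it a second time with the arguments swapped would report what exists only in the second image.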

The uniqueness of overlay processing

Almost all software performs the overlay comparison automatically in a single pass, and the result is frozen: it cannot be modified.

With the “COMPARE” tool, you can compare as many times as you want by superimposing the entire image or a specific part of the image with a different tolerance each time.

Principle of overlay comparison for PDF files

The content of PDF files is much richer than that of image files. However, some comparison software does not use this richness: it converts the PDF file directly into an image file, which brings us back to the case described above.

When comparison software uses the wealth of elements contained in PDF files, several important elements can lead to better detection of similarities and differences, some of which are the same as for Image files.

The color of the elements

As with Image files, the “COMPARE” tool does not use color as a differentiating element. The principle of transforming the image into “Black and White” is the same as for pixel images. This choice prevents the detection of differences due to color.

Tolerance

In the “COMPARE” tool, the tolerance is calculated based on the average thickness of all the lines, but you can modify it at any time and choose the tolerance that best suits the differences that you want to identify. You can repeat the overlay processing as many times as you want, by changing the tolerance value, as for Image files.
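The idea of a default derived from line thickness, with a user override, might look like the Python sketch below. The exact formula is not given in the text, so using the plain mean of the stroke widths is purely a hypothetical choice, as are the function names.

```python
from statistics import mean


def default_tolerance(line_widths):
    """Hypothetical default: the mean stroke width, in drawing units."""
    if not line_widths:
        return 0.0
    return mean(line_widths)


def pick_tolerance(line_widths, user_value=None):
    """Use the user's value when given, otherwise the computed default."""
    if user_value is not None:
        return user_value
    return default_tolerance(line_widths)


widths = [0.25, 0.25, 0.5, 1.0]      # illustrative stroke widths
print(pick_tolerance(widths))        # 0.5
print(pick_tolerance(widths, 0.1))   # 0.1
```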

The uniqueness of overlay processing

In general, software performs the overlay comparison automatically, and the result is frozen: it cannot be modified.

With the “COMPARE” tool you can do the comparison as many times as you want by superimposing the entire image or a specific part of the image with a different tolerance each time.

The nesting of elements

During our numerous tests, some elements were reported as differences even though they were not real differences. We took this into account to improve detection.

Take, for example, two segments that form an 'L' in the first file, and exactly the same shape in the second file, drawn this time as a single polyline. If we compare the files element by element, the elements that make up these two “L” shapes are different, so they are shown as different even though they are visually identical.

A checkbox in the options lets you choose whether such elements are considered similar or different.

Why enable or disable this option? In some cases, the result with the option enabled may be surprising. Take the example above, but with only one of the two segments present in the first file. This segment will be considered “identical” because it is visually present in the second file, while the “L” of the second file will be considered different because only one of its two segments exists in the first file. This can be confusing when interpreting the similarities and differences.
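Both situations can be reproduced with a small Python sketch. It is one possible reading of the option, not the actual implementation of “COMPARE”: each element is broken into its elementary segments, and an element is labeled identical when all of its segments appear in the other file. The element representation and the function names are assumptions.

```python
def segments_of(element):
    """Elementary segments of one element, ignoring drawing direction."""
    _kind, points = element
    return {tuple(sorted(pair)) for pair in zip(points, points[1:])}


def all_segments(elements):
    result = set()
    for element in elements:
        result |= segments_of(element)
    return result


def classify(elements, other_elements):
    """Label each element by whether all its segments exist in the other file."""
    other = all_segments(other_elements)
    return [
        "identical" if segments_of(element) <= other else "different"
        for element in elements
    ]


# An 'L' drawn as two segments, and the same 'L' drawn as one polyline.
file_1 = [("segment", [(0, 0), (0, 10)]), ("segment", [(0, 0), (5, 0)])]
file_2 = [("polyline", [(0, 10), (0, 0), (5, 0)])]
print(file_1 == file_2)          # False: element by element, they differ
print(classify(file_1, file_2))  # ['identical', 'identical']
print(classify(file_2, file_1))  # ['identical']

# The surprising case: only one of the two segments in the first file.
partial = [("segment", [(0, 0), (0, 10)])]
print(classify(partial, file_2))  # ['identical']
print(classify(file_2, partial))  # ['different']
```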


Click here to learn more about 1A3i’s "COMPARE" software.