Introduction: image quality
When we talk about image quality, we usually refer to the degree of fidelity of that image to the real scene it is capturing.
Objective parameters related to image quality:
- Resolution (sharpness): the level of detail that we can see in the image.
- Color: fidelity to what we see with our eyes in the real scene.
- Absence of artifacts: artifacts appear in the image but are not part of the real scene; they can be, for example, digital noise, aliasing effects (moiré), vignetting, lens flare, optical aberrations, etc.
Let's assume that the optical part (the lenses) is ideal, perfect.
If there were no digital noise of any kind, and assuming ideal characteristics of the sensor and the optics, image quality would come down to the sensor's resolution.
Higher resolution means more detail in the image and more fidelity to the real scene.
In the real world, the very nature of light involves statistical fluctuations that translate into photonic noise (shot noise).
Also, the sensor electronics introduce additional noise: thermal noise, read noise, etc.
We can see electronic noise as a kind of base noise, which grows with temperature.
Photonic noise grows with the number of photons received, but more slowly than the signal itself.
Specifically, the amount of noise is related to the square root of the total number of photons. If each cell receives an average of 100 photons, the noise level will be about ten photons in each one. If it receives 10,000 photons, the average noise would be about 100 photons, etc.
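As a quick check of those numbers, here is a minimal sketch in plain Python (using the same photon counts as the example above) of the square-root relationship between the signal and the shot noise:

```python
import math

# Shot noise: for an average of N photons, the fluctuation (noise) is about sqrt(N).
for photons in (100, 10_000):
    noise = math.sqrt(photons)   # photon (shot) noise
    snr = photons / noise        # equals sqrt(photons)
    print(f"{photons} photons -> noise ~ {noise:.0f}, SNR ~ {snr:.0f}")
```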
Since noise is always present, what matters is the relationship between the amount of information (signal) and the noise. This signal-to-noise ratio is known as SNR (Signal to Noise Ratio).
A very high signal-to-noise ratio means that the signal is much stronger than the noise; in those cases, the noise will be practically imperceptible and the image quality will be outstanding.
As the signal-to-noise ratio decreases, the image quality will deteriorate, since noise will be perceived in the image as grain and colored dots.
A low signal-to-noise ratio equates to poor quality images, in the sense that the image is not true to the actual scene.
Taking into account the relationship between photon noise and the total amount of light and the fact that electronic noise is more or less constant, the basic rule of thumb for digital photography is as follows:
The more light (total number of photons) the sensor collects, the higher the overall signal-to-noise ratio and the better the image quality.
Cell size (pixel density) and noise
We are now going to focus on a single sensor cell.
Given a certain intensity of light that reaches the cell (photons per second): the more surface the cell has, the more photons it will capture per unit of time.
Not all photons are converted into electrons. In modern sensors, the quantum efficiency (QE) is on the order of 40-50%. Furthermore, the quantum efficiency depends on the light's wavelength: blue light has a higher efficiency than green light, and green light has a higher efficiency than red light.
In CMOS sensors, the electrons generated from the photons are stored in the cell's reservoir (a capacitor). At such small scales, the cell's size and all its circuitry also limit that reservoir's maximum capacity (its full-well capacity).
For example, to have a reference with real camera sensors (model, full-well capacity per cell, resolution, sensor size):
- Sony a7S II: 160,000 electrons per cell (12 Mpx, Full Frame)
- Sony a7 III: 95,000 electrons per cell (24 Mpx, Full Frame)
- Nikon D850: 60,000 electrons per cell (45 Mpx, Full Frame)
- Nikon D3400: 35,000 electrons per cell (24 Mpx, APS-C)
- Olympus OM-D E-M1 Mark II: 34,000 electrons per cell (20 Mpx, Micro 4/3)
At the cell level, the larger the collecting surface and the electron well, the higher the signal-to-noise ratio that can be achieved.
From this point of view, it is interesting to have cells with the largest possible collecting surface.
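To connect the full-well figures above with the signal-to-noise ratio, here is a small sketch (an idealization that ignores electronic noise; the capacities are the ones listed above): in a purely shot-noise-limited cell, the best SNR it can reach at saturation is roughly the square root of its full-well capacity.

```python
import math

# Full-well capacities (electrons per cell) from the list above.
full_well = {
    "Sony a7S II (12 Mpx, FF)": 160_000,
    "Sony a7 III (24 Mpx, FF)": 95_000,
    "Nikon D850 (45 Mpx, FF)": 60_000,
    "Nikon D3400 (24 Mpx, APS-C)": 35_000,
    "Olympus E-M1 Mark II (20 Mpx, m4/3)": 34_000,
}

for model, capacity in full_well.items():
    # Shot-noise-limited SNR at saturation: N / sqrt(N) = sqrt(N)
    print(f"{model}: best per-cell SNR ~ {math.sqrt(capacity):.0f}")
```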
This conflicts with the sensor’s resolution since at higher resolution, there will be more density of cells per unit area and therefore less area per cell.
That is, for a given sensor size, for example the APS-C format, the higher the resolution (megapixels), the lower the performance at the cell level: each pixel in the image will have, on average, more noise compared to a sensor with lower resolution.
Let's see a simple example assuming two sensors. Each cell of sensor A is four times larger (in area) than each cell in sensor B.
Imagine that we need to take a photo of a low-light scene with a minimal exposure time (a typical situation where we have to raise ISO to get a proper exposure).
Suppose A’s cell collects 4000 electrons. Since cell B has an area that is 25% of cell A, it collects 1000 electrons in that interval.
The photon noise of A will be about 60 photons. That of B will be about 30 photons. Let's assume that the thermal noise is negligible. The signal-to-noise ratio in each cell will be approximately:
SNR A = 4000/60 = 67
SNR B = 1000/30 = 33
If we take into account thermal noise, etc., the difference would be even more significant.
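A short sketch of this comparison (computing the shot noise with the square root directly rather than the rounded 60/30 figures; the 15-electron read-noise value is only an illustration, not a measured figure) shows how a constant electronic noise term, added in quadrature, hurts the smaller cell proportionally more:

```python
import math

def snr(signal_e, read_noise_e=0.0):
    """SNR with shot noise sqrt(signal) plus a constant electronic (read) noise term."""
    total_noise = math.sqrt(signal_e + read_noise_e ** 2)
    return signal_e / total_noise

for read_noise in (0, 15):  # electrons; 15 e- is an illustrative value only
    snr_a = snr(4000, read_noise)   # large cell
    snr_b = snr(1000, read_noise)   # small cell (1/4 of the area)
    print(f"read noise {read_noise:>2} e-: SNR A ~ {snr_a:.0f}, "
          f"SNR B ~ {snr_b:.0f}, ratio ~ {snr_a / snr_b:.2f}")
```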
With base ISO that photo would come out very dark (there are few electrons compared to the maximum capacity of the cell), so we would raise the ISO to obtain the desired level of exposure. Increasing the ISO is equivalent to scaling each cell’s value (amplifying its value), and we would have two images of the same scene.
The image taken with sensor A would have pixels with a better signal-to-noise ratio, more faithful to the scene. The image taken with sensor B would have more resolution (in that sense, it would be more faithful concerning the scene’s details), but each of its pixels would have more noise.
Sensor size and noise
Cell size (pixel density when comparing sensors of the same size) is a significant factor in sensor performance, but it is not the only factor.
We have talked about image quality at the cell level; that is, the signal-to-noise ratio tells us that large cells represent the brightness level of the actual scene at that point more faithfully.
But if we think about it, what matters is the quality of the image as a whole, including a faithful reproduction of the tones and also of the details (resolution).
What’s more, the important thing is the image’s quality on its final support: printed photography, posters, monitor, phone or tablet screen, etc.
We are going to imagine several cases to compare and draw conclusions. In all of them, we are going to assume that the sensor technology is similar:
- We have sensor A and sensor B, like the ones in the previous example: both are the same size but with different resolutions, and each cell of A is four times larger (in surface area) than each cell of B.
- We have two sensors, C and D. C is twice as big as D (for example, a Full Frame sensor compared to a Micro 4/3 sensor), but D has a much lower resolution and, therefore, larger cells than C.
- We have two sensors, E and F, with the same resolution but different sizes. For example, imagine a 20 Mpx Full Frame sensor and a 20 Mpx Micro 4/3 sensor.
Sensors of the same size but different resolution
First case: if we have two sensors of the same size but different resolutions (A and B), which will offer better image quality?
Well, in general, the two will give a similar result.
Keep in mind that the images taken with sensors A and B of the same scene will look different if we zoom in to 100%. In one, we will have coarser points, but more homogeneous in terms of the tonal variations due to noise. In the other, we will have more detail but more tonal variability (more grain at that level of detail).
But comparing the 100% enlarged images is of no use to us. We have to compare the images in their final support. For example, we can print the two images at the same physical size, or we can compare the two rescaled to the size of a monitor screen.
Let’s suppose that we rescale the image from sensor B to have the same resolution as sensor A. This way, we can compare on equal terms.
What happens when you resize an image? If we rescale by averaging, we take groups of points from the image, average their value, and convert them into a single larger point with that average value (brightness). What we achieve is to increase the signal-to-noise ratio of that new point.
This occurs because scene information usually has a strong spatial correlation. In contrast, noise has no spatial correlation (unless it is some kind of fixed-pattern noise due to a sensor design defect, etc.). Averaging reinforces the information and reduces the noise.
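A minimal simulation of this effect (assuming spatially uncorrelated noise and a simple 2x2 block average; the signal level is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

# A flat patch of the scene: constant signal plus shot-like random noise.
signal = 1000.0
noisy = signal + rng.normal(0, np.sqrt(signal), size=(512, 512))

# Downscale by averaging 2x2 blocks: one output pixel per four input pixels.
downscaled = noisy.reshape(256, 2, 256, 2).mean(axis=(1, 3))

print("noise before:", noisy.std())        # ~ sqrt(1000) ~ 31.6
print("noise after :", downscaled.std())   # ~ 31.6 / 2, averaging 4 pixels halves it
```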
Smaller sensor but with larger cells
Second case: will a smaller sensor with larger cells offer better image quality than a larger sensor with smaller cells?
In general, no. The larger sensor has a larger total capture surface.
Exposure is defined per unit area. That is, given a light intensity (photons per second per unit area) and a specific exposure time (seconds), the large sensor will capture more photons in total: more light.
Although at the pixel level, we can see that the small sensor more faithfully picks up the tones of the scene, the image as a whole will have a better signal-to-noise ratio in the large sensor.
And again, when rescaling to compare, or when comparing the images on their final support (print, monitor screen, etc.), the image from the large sensor will, in most cases, have better quality in terms of perceived noise.
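As a rough back-of-the-envelope sketch (the sensor dimensions are approximate and the exposure figure is arbitrary): exposure is defined per unit area, so the total number of photons, and with it the whole-image signal-to-noise ratio, grows with the sensor area.

```python
import math

# Approximate sensor areas in mm^2 (illustrative figures).
areas = {"Full Frame": 36 * 24, "Micro 4/3": 17.3 * 13.0}

exposure = 1000.0  # photons per mm^2 for a given scene, aperture and shutter speed

for name, area in areas.items():
    photons = exposure * area
    # Shot-noise-limited SNR grows with the square root of the total light.
    print(f"{name}: {photons:,.0f} photons in total, relative SNR ~ {math.sqrt(photons):.0f}")
```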
Same resolution but different size
Third case. At the same resolution but different sizes, which sensor will offer better image quality?
The larger sensor offers more quality, both because it has a larger total collecting surface and because each of its cells is larger.
When printing, for example, each point from the small sensor has to be scaled up further than a point from the large sensor to cover the same surface on the paper.
Are these ‘rules’ always followed?
No.
They are generalizations that make sense when comparing general-purpose photographic sensors. There are specialized sensors designed to cover particular situations.
For example, the sensors of the Sony a7S are specifically designed to allow them to work in low light situations; they achieve an excellent signal-to-noise ratio and can raise the ISO value a lot while maintaining incredible image quality.
These sensors sacrifice resolution (12Mpx) in favor of low light performance and are primarily intended for video.
The other extreme would be sensors with a very high resolution (40 Mpx, 50 Mpx…) designed for situations in which the light conditions are always very good or controllable: studio photography, fashion photography, product photography, and architecture.
What effect does the technological development of sensors have?
It is one of the most important factors, at least to date.
Each new generation of sensors has increased its performance.
For example, micro-lenses allow photons to be concentrated on the cell’s capture surface, and they improve the angle of incidence of light rays.
BSI (back-side illumination) technology allows a larger effective capture surface within the space of the cell itself.
More and more useful collection surface is being achieved, minimizing the separation between adjacent cells.
The electronics associated with each cell are more efficient and produce less thermal noise.
Two sensors from different technological generations are not comparable in terms of performance (we are talking here about noise).
Does this mean that a camera from 5 or 10 years ago is worthless?
Absolutely not. What it means is that if a camera from 5 years ago allowed me to take photos of an acceptable quality up to ISO 800, a current camera of the same range might allow me to take the same images at ISO 3600.
Do I need to take photos at ISO 3600 in my daily life? If I need it, then I am interested in changing my camera. If I don’t need it for my usual photography type, my camera from 5 or 10 years ago will take good photos. I have to know its limits, as in any other camera, no matter how advanced it is.
ISO and noise
Since we are talking about ISO …
Although we have already mentioned it, it must be emphasized that raising ISO does not increase image noise.
The noise was already there in the sensor cells along with the signal (scene information).
Raising the ISO makes the noise in the image more evident and more visible, since we are scaling the brightness value of each cell/pixel: we amplify each point's variations in brightness around the average value expected from the scene.
When we compare cameras for their behavior at high ISOs, we see which camera manages to minimize noise the most. The system that controls the sensitivity (ISO) does not do any magic, and it only does a scaling/amplification of what is in the cell.
Some sensors achieve a thermal/electronic noise level so low that it becomes negligible compared to the photonic noise. They are known as ISO invariant sensors.
Imagine taking two photos of the same scene with an ISO invariant sensor (in RAW format, of course).
The first photo is correctly exposed, let's say at ISO 1600. The second photo is taken at base ISO (ISO 100, say) with the same aperture and shutter speed.
Logically, the second photo will come out very underexposed; it will be four stops darker than the first. But if the exposure is raised four stops in the development program, the result will be practically identical to that of the first photo: tones, brightness, and noise level.
As the thermal noise is negligible, it makes no difference whether we amplify the signal at the cell level inside the sensor (raising ISO) or amplify it later in the development program.
There is always some thermal noise and other small noise sources, so ISO invariance is, logically, not perfect. It cannot be maintained beyond a certain range of stops, and there will always be a little more noise in the underexposed image.
The point is simply to give you an idea that the ISO setting of a sensor does not do anything magical; it neither adds nor removes noise.
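A toy simulation of that experiment (a deliberately simplified model: the signal level, the gain and the read-noise values are made up, and it assumes that in-camera amplification happens before the readout noise is added, while the development program amplifies everything, noise included):

```python
import numpy as np

rng = np.random.default_rng(1)

signal = 50.0   # mean electrons per cell in a dim scene (illustrative value)
gain = 16.0     # 4 stops of amplification (e.g. ISO 100 -> ISO 1600)
n = 1_000_000   # number of simulated cells

for read_noise in (0.5, 8.0):  # electrons; small = "ISO invariant", large = not
    electrons = rng.poisson(signal, n).astype(float)   # signal + shot noise
    readout = rng.normal(0.0, read_noise, n)           # electronic noise at readout

    in_camera = electrons * gain + readout   # amplified before the readout noise
    in_post = (electrons + readout) * gain   # brightened later in software

    print(f"read noise {read_noise} e-: noise (std) in camera ~ {in_camera.std():.0f}, "
          f"raised in development ~ {in_post.std():.0f}")
```

With a tiny read noise, the two results are essentially identical, which is the ISO invariant behavior described above; with a larger read noise, brightening in the development program amplifies that noise too, and the underexposed shot ends up noisier.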
Exposure and noise
From everything we have seen about noise in sensors, the conclusion would be: to maximize the signal-to-noise ratio, whatever the sensor or camera, the most important thing is to collect as much light as possible.
When the lighting conditions of the scene are good, all cameras take good photos.
Even cameras with very different sensors in size and resolution. We are talking here about quality in terms of noise.
Then we would have to see, for example, the sharpness due to the optics used by the camera, etc.
When the lighting conditions are not so good, for example, in low light situations where we need to shoot with a high shutter speed, the sensor’s size and technology make all the difference. The same is true for high dynamic range scenes, which combine very dark areas with very bright areas.
Also, in video we have the limitation of the shutter speed, and in scenes with less light it is inevitable to raise ISO to get adequate exposure.
In any case, the important thing is to try to make a good exposure and collect as much light as possible, at the camera's base ISO if possible.
If we shoot in RAW format, we can overexpose a little, as long as we do not burn the most illuminated areas, because then we would lose that information.
This technique is known as exposing to the right (Exposure To The Right, ETTR).
By overexposing, we maximize the number of photons in all cells and minimize noise (we will achieve a higher signal-to-noise ratio).
Then, in the development program, we lower the exposure to the right level. We achieve a noise reduction that is especially noticeable in the darkest areas (cells that receive fewer photons and therefore have a worse signal-to-noise ratio).
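A toy sketch of that shadow-area improvement (the photon count and the read-noise value are made-up illustrative figures):

```python
import math

photons = 100.0     # photons reaching a shadow cell at the metered exposure
read_noise = 10.0   # electrons of electronic noise (illustrative value)

def shadow_snr(n_photons, read_noise_e):
    """SNR of what the cell actually captured; lowering the exposure later in the
    development program divides signal and noise alike, so the SNR is unchanged."""
    return n_photons / math.sqrt(n_photons + read_noise_e ** 2)

print(f"SNR at metered exposure: {shadow_snr(photons, read_noise):.1f}")      # ~ 7.1
print(f"SNR with ETTR (+1 stop): {shadow_snr(photons * 2, read_noise):.1f}")  # ~ 11.5
```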
With the most modern sensors (invariant to ISO), this technique is not as effective. It must be taken into account that later the images will need an editing process (at least in the development program).
Also, bear in mind that there is a risk of burning the highlights and losing information when exposing to the right. The light metering systems and the cameras' histograms are not accurate enough to push the highlights right up to the limit. It will be up to each photographer to determine whether, with their equipment and type of photography, it is worth using this technique or not.
In short, it is essential to expose well. The more light we get from the scene, the better.
Image processing
Can noise be removed from an image in editing or by the camera’s internal processor?
It is very complicated because the information and the noise are mixed once they leave the sensor.
We have already commented that averaging adjacent pixel groups reduces noise (random noise with little spatial correlation). It is equivalent to a low-pass filter: it removes the high spatial frequencies of the noise, but it also removes the high spatial frequencies of the scene information.
Averaging therefore smooths the image and causes a loss of sharpness, especially in the details: image edges, textures, etc.
There are ‘smart’ averaging techniques that take into account, for example, whether it is a homogeneous area of the scene: a patch of sky, an area with little texture, etc. It is precisely in those areas where the noise is most visually noticeable and where it would be most effective to apply an averaging.
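A minimal sketch of that idea (a hypothetical filter written with NumPy and SciPy, not how any particular camera or editor actually implements it): smooth only where the local variance is low, so flat areas get cleaned up while edges and textures are left untouched.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def selective_denoise(img, window=5, threshold=1200.0):
    """Average flat regions only; leave detailed regions untouched."""
    local_mean = uniform_filter(img, window)
    local_var = uniform_filter(img ** 2, window) - local_mean ** 2
    flat = local_var < threshold   # low local variance = flat area, mostly noise
    return np.where(flat, local_mean, img)

# Example: a flat, noisy patch gets smoothed (the threshold is an arbitrary value).
rng = np.random.default_rng(0)
noisy = 500.0 + rng.normal(0, 20, (256, 256))
print(noisy.std(), selective_denoise(noisy).std())
```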
Image stacking techniques: take multiple images of the same scene (with a tripod, or with very short exposure times) and stack them to average the value at each point.
The values that correspond to the scene are reinforced, while the variations due to noise tend to cancel out since they have a random spatial distribution.
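A short simulation of stacking (assuming perfectly aligned frames of a static scene and spatially random noise): averaging N frames reduces the noise roughly by the square root of N.

```python
import numpy as np

rng = np.random.default_rng(2)

scene = rng.uniform(100, 1000, (256, 256))   # the "true" static scene
n_frames = 16

# Each frame: the scene plus independent shot-like noise.
frames = [scene + rng.normal(0, np.sqrt(scene)) for _ in range(n_frames)]
stacked = np.mean(frames, axis=0)

print("noise of a single frame:", (frames[0] - scene).std())
print(f"noise of the {n_frames}-frame stack:", (stacked - scene).std())  # ~ 1/4 as much
```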