How autofocus works on cameras

The autofocus system is currently one of a camera’s most critical and most valued functions.

What is focusing on a camera?

In the entry on depth of field, we saw that in photography an element of the scene is considered to be in focus when the points of the image corresponding to that element are as small as possible (they form a circle of confusion so small that the human eye perceives it as a point).

In practice, when we look at an image, the elements in focus show high contrast: their edges and lines are clearly visible, and the separations between their parts are sharp and well differentiated.

Items that are not in focus appear blurrier, to the point that they may become unrecognizable.

The focusing process involves moving the lens relative to the sensor plane:

  • In the case of a simple lens, when we move the lens away from the sensor, we focus on closer objects.
  • If we bring the lens closer to the sensor, we focus on more distant objects.

There comes a point at which the lens’s focal plane practically coincides with the sensor plane. In that case we are focusing at infinity: all distant objects (beyond a certain distance) will be in focus.

At the other extreme, to focus on objects very close to the camera, the lens has to be separated from the sensor’s plane.

In macro photography, dedicated macro lenses that can focus from very close, or extension tubes that serve the same purpose, are used to ‘move away’ the optical center and focus with the camera very close to the subject, achieving higher magnification.

Camera lenses are actually made up of a system of several lens elements, but the focusing principle is the same.

What usually happens is that not all the elements move within the lens, but rather a specialized focusing group whose movement is equivalent to shifting the system’s optical center.

On interchangeable-lens cameras (SLR and mirrorless), most lenses include a manual focus ring.

a simple camera lens



The first autofocus cameras emerged around 1980.

Today practically all cameras include an autofocus system; some do not even offer the possibility of manual focus.

In advanced compact and interchangeable lens cameras, there is the option of working with autofocus (this is common in most situations) or with manual focus using the lens’s focus ring.

The autofocus system works as follows:

  • The camera incorporates a detector that analyzes, typically, a small part of the image (the area of the scene we want to focus on).
  • The electronic system decides whether that piece of the image has contrast or is blurred.
  • Contrast is usually detected from abrupt transitions between elements of the scene: edges, lines, textures.
  • If the system determines that the image is blurred, it sends the order to move the focus lens slightly, and re-evaluates.

There comes a point at which the system determines that it has achieved maximum contrast, i.e. full focus, for the area we want to focus on.
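The loop described above can be sketched in a few lines of Python. The contrast metric (sum of squared differences between neighboring pixels, which scores abrupt transitions highly) and the fake `capture` function are illustrative assumptions, not any camera’s actual firmware:

```python
def contrast(strip):
    """Toy contrast metric for a 1-D strip of pixel values: sum of
    squared differences between neighbors. Abrupt transitions
    (edges, lines) score high; blur lowers the score."""
    return sum((b - a) ** 2 for a, b in zip(strip, strip[1:]))

def autofocus(capture, positions):
    """Naive version of the loop above: move the focus lens to each
    candidate position, read the AF area, keep the position with the
    highest contrast. capture(p) stands in for 'move the lens to p
    and read out the detector'."""
    return max(positions, key=lambda p: contrast(capture(p)))

# Fake scene: the edge is sharpest when the 'lens' is at position 2.
frames = {
    0: [0, 60, 120, 180, 240, 255],   # very blurred
    1: [0, 20, 120, 160, 250, 255],   # slightly blurred
    2: [0, 0, 0, 255, 255, 255],      # sharp edge
}
best = autofocus(frames.get, [0, 1, 2])   # -> 2
```

A real camera does not try every position, of course; the sections below describe how the different systems decide where to move the lens.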

auto focus camera lens

What is the ideal focus system like?

It would be a system that:

  • Achieves focus fast; the faster the better, ideally instantaneously.
  • Achieves precise focus on the exact point of the scene we want to focus on.
  • Achieves focus under any circumstances.

Such an ideal system does not exist, although focusing systems keep getting faster, more precise, and more versatile.

You should also note that focusing speed depends on the system as a whole: the precision of the detector, the algorithm that adjusts the lens position (how and how much the lens has to move), and the speed and accuracy of the focusing motor.

And it also depends on external conditions: the amount of light in the scene, the texture of the object we focus on.

Let’s look at the techniques currently used in focusing systems, with their pros and cons.

camera lens for auto focus

Phase detection focus (reflex)

It is the system used by most SLR cameras.

The mirror assembly of an SLR camera is actually made up of two mirrors.

The primary mirror sends the image to the optical viewfinder, but it is partially transparent: it lets a certain amount of light pass through to a second mirror, called the secondary mirror or sub-mirror, which reflects the image towards the phase detector.

The phase detector is a light sensor, which works similarly to the image sensor.

However, this sensor only receives a tiny part of the scene, for example, an area in the center of the image (or the location indicated by the focus point selected on the camera).

The focus sensor is specialized in detecting light transitions in the scene, for example an edge of an object, a line, a texture: anything that generates a significant contrast between two points of light. This transition is turned into an electrical signal, which we can imagine as a spike.

For each focus point there are actually two separate sensors, which work by triangulation.

Each of them receives the same image of the area we want to focus on. When the image is in focus, the peaks of the two electrical signals coincide. When the image is out of focus, the peaks don’t match, and the electronics can calculate precisely where the lens needs to move and how far we are from the focus point.
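The peak comparison can be illustrated with a toy sketch in Python. The signals and the correlation search are invented for illustration; a real phase detector does this in dedicated hardware, not code like this:

```python
def phase_offset(a, b, max_shift=4):
    """Find the shift (in samples) that best aligns signal b with
    signal a, much as the electronics compare the two spikes.
    An offset of 0 means the peaks coincide (in focus); the sign
    says which way to move the focus lens, the magnitude says
    roughly how far. (Illustrative sketch, not real AF firmware.)"""
    def overlap(shift):
        lo, hi = max(0, shift), min(len(a), len(b) + shift)
        return sum(a[i] * b[i - shift] for i in range(lo, hi))
    return max(range(-max_shift, max_shift + 1), key=overlap)

in_focus = [0, 0, 1, 8, 1, 0, 0, 0]   # spike at index 3 on both sensors
shifted  = [0, 0, 0, 0, 1, 8, 1, 0]   # same spike seen 2 samples later

phase_offset(in_focus, in_focus)   # -> 0  (peaks coincide: in focus)
phase_offset(in_focus, shifted)    # -> -2 (peaks displaced: move the lens)
```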

Some focus points only detect vertical transitions (vertical lines or edges in the scene), others only horizontal ones, and some detect both; these are called cross-type AF points.

The phase-detection focusing system is very fast and very accurate.

Since the system knows at all times where the lens has to move, it works very well both for fast focusing and for tracking moving subjects; the electronics can even introduce a certain margin of prediction.

For phase detection to work correctly, a certain amount of light is required in the scene. The scene must also contain those horizontal or vertical lines (edges, in short), and the area, at least at the focus point, must have some texture.

One of the disadvantages of the system is its construction complexity.

The problem comes because the phase detection sensors are located in a different plane from the image sensor.

They do not see exactly what reaches the image sensor; they are independent elements.

Therefore, the entire system must be precisely built (the mechanical part) and synchronized (the mechanical and electronic parts). Each camera has to be calibrated, one by one, with great precision; otherwise it will show back-focus or front-focus problems, and images will appear out of focus.

Another problem with the traditional phase-detection approach in SLRs is that the system is no longer operational when the mirror is raised.

For this reason, when we use the screen (live view) for photography instead of the optical viewfinder, the focus is usually slower and sometimes much slower depending on the camera.

And the same is true when using the SLR camera for video, as the mirror remains raised all the time. The pure phase-detection focusing system (using separate sensors) is not suitable for video.

Phase Detection Focusing is known in English as PDAF (Phase Detection Auto Focus). This nomenclature is also used to refer to systems that use hybrid focus, phase + contrast, both in cameras and mobile phones.

DSLR camera and lens closeup

Contrast detection focus

It is the system used by most compact cameras, and by many SLRs when working in live view mode (through the screen). From a technical point of view, it is a very simple system: it needs no external elements, no additional sensors, no complex electronics, and no calibration.

Once we select the area we want to focus on in the scene, the processor analyzes it directly from the sensor’s image.

The system sweeps, moving the focusing lens, and at each position, calculates the contrast level of the image. Scanning stops when the maximum contrast level is determined, and the processor moves the lens to that position.

In principle, it is a trial and error process because the system does not know where to move the lens or how much to move it, and therefore it is a relatively slow process compared to phase detection.

The lens movement of traditional contrast detection is characteristic: the lens travels forward, then backward, then forward a little, in a kind of back-and-forth until focus is achieved. This is known in English as autofocus hunting.
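The sweep-and-refine behavior can be sketched as a simple hill climb. Everything here (the position scale, the step sizes, the `measure` function) is an illustrative assumption, not any camera’s firmware:

```python
def contrast_af(measure, pos=0.0, step=1.0, tol=0.01):
    """Hill-climb sketch of contrast-detection AF. measure(p) returns
    the contrast of the AF area with the focus lens at position p.
    The lens sweeps past the contrast peak, reverses with a smaller
    step, sweeps back, and so on: the 'hunting' movement."""
    best = measure(pos)
    while abs(step) > tol:
        trial = measure(pos + step)
        if trial > best:                 # contrast still rising: keep going
            pos, best = pos + step, trial
        else:                            # overshot the peak: reverse, refine
            step = -step / 2
    return pos

# Hypothetical contrast curve with its peak at position 2.7:
peak_pos = contrast_af(lambda p: -(p - 2.7) ** 2)   # lands close to 2.7
```

Note how many times `measure` is called: each evaluation is a lens move plus a sensor readout, which is why pure contrast detection is relatively slow.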

When continuous focus is activated, this swing is constant because the camera has to ensure in real-time that the distance between the camera and the object we are focusing on has not changed.

In photography, this effect (with continuous focus) can be annoying when viewing the scene through the electronic viewfinder or the rear screen, but the photograph itself, the final image, comes out correctly focused.

Video (with continuous focus) is more problematic, because the algorithms have to track the contrast of the area of interest while at the same time minimizing the focus-hunting effect, which can become very annoying.

A balance has to be found between the system’s responsiveness to changes in the scene and the precision of focus. That is why video autofocus based on contrast detection tends to react more slowly, and transitions (going from focusing on one object to another located on a different plane) are not as smooth.

However, contrast detection also has advantages:

  • The focus plane is the sensor plane itself, so there is no back-focus / front-focus problem. The process feeds back on itself, so when focus is achieved it is usually exact (maximum contrast).
  • No specific focus points are needed; you can focus using any area of the image.
  • Focus can be achieved with less light in the scene.
  • Focus can be found in areas without very sharp vertical/horizontal edges.
  • Very complex recognition and prediction algorithms can be applied, for example face recognition for faster focusing and tracking.

The contrast-detection approach is known in English by the acronym CDAF (Contrast Detection Auto Focus). This nomenclature is also used for systems that do not take a hybrid approach, for example cameras or mobile phones that rely solely on contrast detection.

Nikon camera and its lenses

Hybrid focus built into the image sensor

This is probably the system of the future, and it is already used, in different variants, in practically all cameras today.

The idea is straightforward: instead of using a separate sensor to do phase detection, why not use the image sensor itself?

Image sensors using hybrid technology include areas (pixels) dedicated exclusively to phase detection.

These unique pixels are distributed throughout the sensor area.

There may be many phase-detection zones on the sensor. For example, the Sony a6000 mirrorless camera includes 179 phase-detection points distributed across the entire area of its APS-C sensor.

In general, these phase detectors built into the image sensor are not as effective as the independent phase detectors of SLRs.

Keep in mind that the independent detectors are specialized sensors, with precise, high-speed electronics and internal optics optimized for phase detection; the separation between each pair of detectors allows more accurate triangulation.

But the advantage of the hybrid system is that the phase detectors lie exactly in the sensor plane (they are part of it), so there are no calibration error problems.

Another great advantage is that the two focusing techniques can be combined: phase detection tells the processor very quickly where to move the lens, and contrast detection takes care of fine-tuning the focus to achieve the highest possible contrast.
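This division of labor can be sketched as follows; the `phase_jump` and `measure` functions and all the numbers are hypothetical stand-ins for the real hardware:

```python
def hybrid_af(phase_jump, measure, pos=0.0, fine_step=0.1, tol=0.001):
    """Hybrid AF sketch (illustrative): the phase-detection points
    give a one-shot estimate of where the focus lens should go
    (direction and distance), then a short contrast-detection climb
    fine-tunes around that position for maximum contrast."""
    pos += phase_jump(pos)               # coarse, fast: no trial and error
    step = fine_step
    while step > tol:                    # fine, slow: tiny local search only
        if measure(pos + step) > measure(pos):
            pos += step
        elif measure(pos - step) > measure(pos):
            pos -= step
        else:
            step /= 2
    return pos

sharpness = lambda p: -(p - 5.0) ** 2    # hypothetical contrast curve, peak at 5.0
estimate  = lambda p: (5.0 - p) + 0.3    # phase estimate, deliberately 0.3 off
result = hybrid_af(estimate, sharpness)  # ends up very close to 5.0
```

The contrast phase only has to search a small neighborhood, which is why the combination is both fast and exact.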

Furthermore, powerful prediction algorithms can be implemented, using many phase-detection points and many contrast-detection areas simultaneously.

All manufacturers are developing new technologies and algorithms based on hybrid phase/contrast detection.

The speed and precision of focus based on this technology are increasing.

One of the drawbacks of hybrid focusing systems is that the cells dedicated to the focusing system do not contribute to light detection to generate the final image. We can imagine a photo with ‘gaps’ that correspond to the position of those focus points.

The camera internally has to do some interpolation to reconstruct a complete image. And this process can lead to a noticeable banding effect, especially when using a very high ISO or recovering shadows in the development/editing process. In these extreme situations, a pattern can be seen in the image, usually in bands of different tones or colors.

Under normal conditions, these effects or patterns in the image are invisible.

In video, the hybrid focus system’s advantage is that the phase-detection part knows whether the distance between the camera and the subject has changed, so the camera does not have to continuously analyze contrast (trial and error). The focus-hunting effect is greatly reduced.

With face (and eye) detection algorithms, the phase-detection system helps determine how far away the main subject is, while the contrast-detection side is in charge of analyzing and detecting the patterns (face, eyes, etc.). In a system based solely on contrast detection, it is sometimes difficult to identify a face in an image that is totally out of focus, without distinctive features or patterns.

Dual Pixel CMOS Focus

This Canon technology is a hybrid focus system.

It uses phase-detection built into the image sensor, but all the sensor pixels are used for phase-detection focusing.

Each pixel of the sensor comprises two independent cells (two photodiodes, A and B) sharing a single micro-lens.

At the moment of focusing, each pair of cells (in the area we are focusing on) works as a phase detector, triangulating the distance to the subject in order to focus.

Another way of looking at it is to think that the camera has two images: one formed from the A cells and the other formed from the B cells. By superimposing these two images and examining their differences, the camera can determine where it has to move the focus lens.
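The A/B readout can be pictured with a tiny sketch (all values invented for illustration):

```python
# Each Dual Pixel photosite contains two photodiodes, A and B.
# For focusing, the camera reads them as two half-images and compares
# them like the two signals of a phase detector; when the shutter is
# pressed, each pair is simply summed into one image pixel.
photosites = [(2, 1), (5, 4), (9, 8), (5, 6), (2, 3)]   # (A, B) pairs

image_a = [a for a, _ in photosites]        # image seen by the A cells
image_b = [b for _, b in photosites]        # image seen by the B cells
photo   = [a + b for a, b in photosites]    # recorded image pixel values

# In focus, image_a and image_b coincide; out of focus they are
# shifted copies, and their offset tells the camera which way,
# and how far, to move the focus lens.
```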

The contrast-detection system and the above algorithms can be in charge of fine adjustment or detecting and tracking patterns (e.g., faces, eyes, etc.).

When the shutter button is pressed, each pixel pair is combined to generate the image information, as if it were a single photodiode.

This type of system works very well for video and for tracking subjects: once the subject we want to focus on is ‘locked’, it allows relatively precise and fast tracking across the entire scene, since the ‘focus points’ are evenly distributed over the whole sensor.

The Dual Pixel system is not as fast as the traditional phase detection approach for photography (that of specialized SLR cameras), although these types of systems evolve with each generation of cameras.

The reasons are the same ones discussed for the hybrid approach: an independent phase detector is optimized for this task, and the physical separation of each sensor pair makes triangulation easier.

One drawback of the system is the price. Building a Dual Pixel sensor is more expensive than building a traditional sensor or a hybrid sensor (Hybrid CMOS).

In the video, the Dual Pixel system behaves similarly to the generic hybrid system.

The phase-detection part gives the initial push to estimate where the main subject in the scene is located, and the face, eye, or object detection and tracking algorithms do the fine-tuning. The two systems, phase + contrast, continually provide information to the camera.

Panasonic DFD focus

Panasonic cameras (starting with the Panasonic GH4) use a DFD system (Depth from Defocus) based on contrast detection.

As we saw in the corresponding section, contrast detection has the problem that the system does not know in which direction to move the focusing lens or how far to move it.

The basic contrast-detection algorithm analyzes the image as the focus moves until the contrast level (in the area we are focusing on) reaches a maximum. Suppose 10-15 frames of the scene have to be analyzed before the exact focus point is found.

The DFD algorithm is based on the following:

  • The camera analyzes the first image and compares it with the next one (taken with the focusing lens in a different position).
  • The camera looks up in its database the characterization of the lens in use; from that information and the comparison of the two images, it can calculate how far the focusing lens has to move.
  • Once the focus lens is in position, fine-tuning is done by trial and error (as in basic contrast detection).

The advantage of the Depth from Defocus system is that the camera only has to analyze 4-5 images, and in most cases the focus lens travel is much shorter: a direct movement towards the estimated position followed by a couple of small corrections (compared to the 10-15 movements it would make with pure contrast detection).
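The lookup idea can be sketched as follows; the blur metric, the table values, and the function names are all invented for illustration (Panasonic’s real per-lens characterization data is not public):

```python
def dfd_jump(blur_change, lens_profile):
    """DFD sketch (illustrative): the camera compares the blur of two
    frames taken with the focus lens in slightly different positions,
    then consults the stored characterization of the mounted lens to
    compute, in one step, how far the focus lens should travel.
    lens_profile plays the role of the per-lens data shipped in
    firmware: observed blur change -> required lens travel."""
    nearest = min(lens_profile, key=lambda k: abs(k - blur_change))
    return lens_profile[nearest]

# Hypothetical characterization of one lens:
profile = {-2.0: 8.0, -1.0: 4.0, 0.0: 0.0, 1.0: -4.0, 2.0: -8.0}

jump = dfd_jump(1.1, profile)   # -> -4.0: one direct move instead of a sweep
```

This is why DFD needs the lens to be characterized in advance: without `profile`, the blur comparison tells the camera nothing about how far to move.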

That translates into shorter response times.

The downside of DFD is that it only works with specific lenses: the ones Panasonic has characterized, which are precisely its own lenses.

DFD does not work with non-Panasonic lenses. And when a new lens comes out, the camera firmware needs to be updated so that the DFD system recognizes it.

With other lenses, the camera uses the base contrast-detection focusing system.

In the video, the DFD system is faster when making transitions between shots. For example, when focusing from a close object to a distant one.

But once the subject is in focus (if we use continuous focus), the system has to periodically check that the distance between subject and camera has not changed, while minimizing the focus-hunting effect (micro-variations of focus) as much as possible.

Again, that balance makes DFD systems somewhat less responsive to changes in the scene than cameras with hybrid focus systems.