The concepts of the ophthalmic and optometric devices presented in this book can only be understood with a well-founded knowledge of geometric and wave optics. Hence, we append this chapter to ensure a common knowledge base for all readers. Our intention is not to reproduce the content of an entire optics textbook. However, the following sections cover all topics that are relevant for the discussions in the ophthalmic system parts of this book.
Although “light” is something we are familiar with in our daily lives, its nature is hard for us to imagine. Light is nothing we can directly touch or manipulate. We can only learn about its properties via its interactions with matter.
A first elaborate description of light was developed in the seventeenth century. Classical geometric optics (ray optics) describes the propagation of light through and around objects by rays. In this model, light behaves like a projectile (or particle) whose flight path is influenced by interactions with media. So, this model focuses on the location and direction of light rays, while the intrinsic properties of light are ignored.
But already in the seventeenth century, interference and diffraction effects had been observed in experiments and could not be explained in terms of geometric optics. Scientists like Christiaan Huygens (1629–1695) proposed that light has a wave-like character. With this approach, the laws of reflection and refraction could be explained. Wave optics was further improved by Thomas Young (1773–1829) and Augustin Fresnel (1788–1827), who introduced the interference principle for the superposition of waves.
In the nineteenth century, Michael Faraday (1791–1867) and James Maxwell (1831–1879) extended the wave model. They discovered that light is an electromagnetic wave. This had far-reaching consequences particularly for the explanation of light–matter interaction. In this way, for instance, it was possible to explain the wavelength dependence of refraction. By this theory, visible (VIS) light became essentially the same physical phenomenon as microwaves, infrared (IR) radiation, ultraviolet (UV) radiation, and X-rays. The only difference is the corresponding wavelength. In 1888, the electromagnetic wave theory was experimentally verified by Heinrich Hertz (1857–1894).
It is quite interesting that the question of whether light is a particle or a wave is a recurring topic in history. Within the framework of quantum theory, Albert Einstein (1879–1955) proposed that light must have a momentum like small “solid” particles (photons). This assumption was based on the photoelectric effect which was a puzzling experiment that seemed to contradict the classic wave concept. In the early twentieth century, it became evident that the apparent dichotomy of the particle and wave descriptions in the macroscopic world is removed on the atomic scale. This is also known as the “wave-particle duality”. Quantum optics reveals that an intuitive understanding of light’s nature is beyond our imagination.
Four different approaches therefore exist to describe the behavior of light in optical systems: geometric, wave, electromagnetic, and quantum optics. Depending on the given problem or application, we have to choose the formalism which is most intuitive and feasible for the task at hand. Geometric optics (Section A.1) and wave optics (Section A.2) are sufficient to understand most topics in ophthalmology and optometry. However, photons play a major role in the context of lasers (Chapters 9 and 10, and Appendix B).
In geometric optics, light is treated as a bundle of rays which travels in optical media and obeys a set of geometric laws. As an example, let us consider a transparent, flat glass slide which is embedded in air as shown in Figure A.1a. When a light beam is incident on the left air–glass interface at an angle γ, some of it is reflected at an angle of
γr = γ    (A1)
while the rest enters the glass body. The incoming, transmitted, and reflected rays all lie in the plane of incidence. The ray which enters the body now also changes its direction (γ′ ≠ γ). This effect is referred to as refraction. At the other interface, the ray is again partly reflected and partly transmitted. The transmitted portion is deflected to the other side so that it travels again at the angle γ.
Why do incident rays change their direction at both air–glass interfaces? Obviously, it has something to do with the inherent properties of the medium. Here, Fermat’s1) principle gives a possible clue: light rays always choose the path which can be traversed in the least time. Keeping this in mind and looking again at Figure A.1a, we see that the light ray tries to minimize the traveling time in glass. Hence, glass seems to slow down the light rays.2) As a consequence, the fastest route from point A to point B requires the angle of travel to be changed (in our example: γ′ < γ).
The highest speed of light is found in vacuum, where light is as fast as c0 = 299 792 458 m/s. For all other media, no matter if they are gaseous, liquid, or solid, the speed decreases corresponding to the related refractive index n, that is, a dimensionless number greater than one. In an arbitrary medium with refractive index nm, the speed of light is cm = c0/nm. For example, we have cglass = c0/1.52 ≈ 2 × 10⁸ m/s in glass.
The angles of travel outside and inside a medium (γ and γ′, respectively) are directly related to the refractive index via Snell’s3) law. For the general case of an arbitrary medium (primed parameters) embedded in a host medium, Snell’s law reads
n sin γ = n′ sin γ′    (A2)
As long as γ is smaller than a critical angle
γcrit = arcsin(n/n′)    (A3)
the situation is similar to the example given in Figure A.1a. For γ = γcrit, the refracted ray is exactly tangent to the interface. And if the angle of incidence exceeds the critical angle, the ray can no longer enter the adjacent medium. In the example of Figure A.1b, the ray is kept inside the glass slide in that it is reflected back at both interfaces without “losing” intensity. This so-called total (internal) reflection is actually a lossless effect which is used for optical fibers to transfer light over long distances (Section 10.2.4.2).
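As a numerical illustration of Snell's law and the critical angle, the following short Python sketch evaluates both for the air–glass pair used above (n = 1.00, n′ = 1.52); the function names and the 45° test angle are our own choices:

```python
import math

def refraction_angle(gamma_deg, n, n_prime):
    """Refraction angle from Snell's law n*sin(gamma) = n'*sin(gamma')."""
    s = n * math.sin(math.radians(gamma_deg)) / n_prime
    if abs(s) > 1.0:
        return None  # no refracted ray: total internal reflection
    return math.degrees(math.asin(s))

def critical_angle(n_dense, n_rare):
    """Critical angle of total internal reflection (requires n_dense > n_rare)."""
    return math.degrees(math.asin(n_rare / n_dense))

# Air -> glass: the ray is bent towards the normal.
print(refraction_angle(30.0, 1.00, 1.52))   # about 19.2 degrees

# Glass -> air: rays steeper than the critical angle stay inside the glass.
print(critical_angle(1.52, 1.00))           # about 41.1 degrees
print(refraction_angle(45.0, 1.52, 1.00))   # None (total internal reflection)
```

The `None` return in the last call corresponds exactly to the situation of Figure A.1b: beyond the critical angle, Snell's law has no real solution for the refracted ray.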
At the interface of two different media, the refractive index determines not only the ray deflection, but also the fraction of light that is transmitted and reflected. For a normal incident light ray (γ = 0°) and neglected absorption (Section 9.1), the reflected fraction (reflectance) is given by
R = ((n′ − n)/(n′ + n))²    (A4)
and the transmitted fraction (transmittance) by
T = 1 − R = 4nn′/(n′ + n)²    (A5)
The relations (A4) and (A5) are referred to as the Fresnel equations.4) A more general form of the Fresnel equations for oblique incidence can be found in Section A.2.1.4.
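At normal incidence, the Fresnel equations reduce to simple arithmetic. A minimal sketch for the air–glass interface, which reproduces the well-known rule of thumb that each uncoated glass surface reflects roughly 4% of the light:

```python
def reflectance_normal(n, n_prime):
    """Fresnel reflectance at normal incidence: R = ((n' - n) / (n' + n))**2."""
    return ((n_prime - n) / (n_prime + n)) ** 2

R = reflectance_normal(1.00, 1.52)   # air -> glass
T = 1.0 - R                          # transmittance (absorption neglected)
print(f"R = {R:.4f}, T = {T:.4f}")   # R = 0.0426, T = 0.9574
```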
Summing up, refraction occurs at every interface between two media with different refractive indices. The refractive index is in turn a function of the wavelength (color) of the incident light, which is referred to as dispersion. This phenomenon is caused by interactions (scattering and absorption) of light with atoms or molecules and is thus specific for each material.
A well-known experiment to show the effect of dispersion is illustrated in Figure A.2, where a beam of white light shines onto a glass prism. Besides the overall refraction, the incident beam is split into the colors of the rainbow. Since the refractive index depends on the wavelength λ (Figure A.3), we have different refraction angles for each wavelength. As white light is composed of all visible colors, rays with different wavelengths are fanned out by the prism and eventually hit the wall at different positions. A medium whose refractive index decreases with increasing wavelengths (dn/dλ < 0) is said to have normal dispersion (as shown in Figure A.3), whereas the contrary case (dn/dλ > 0) is said to have anomalous dispersion.
Shining the light of an arc lamp onto a prism does not lead to a continuous color fan. In this case, we rather see discrete spectral lines which are separated by totally dark regions. Some of these material-specific emission lines are used to define a measure for dispersion in the visible spectral range, that is, the so-called Abbe number5)
νe = (ne − 1)/(nF′ − nC′)    (A6)
nF′, ne, and nC′ correspond to the refractive indices of a considered medium at the wavelengths λF′ = 480 nm, λe = 546 nm, and λC′ = 644 nm, respectively.6) Following Eq. (A6), we see that media with low dispersion have high values of νe. Abbe numbers are particularly used to classify different sorts of glass and other optically transparent materials ([1]; Problem PA.1).
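Equation (A6) is easily evaluated. The refractive indices in the following sketch are illustrative crown-glass-like values, not taken from any data sheet:

```python
def abbe_number(n_e, n_F, n_C):
    """Abbe number (e-line definition): v_e = (n_e - 1) / (n_F' - n_C')."""
    return (n_e - 1.0) / (n_F - n_C)

# Hypothetical indices at 480 nm (F'), 546 nm (e), and 644 nm (C'):
v_e = abbe_number(n_e=1.519, n_F=1.522, n_C=1.514)
print(round(v_e, 1))  # about 64.9 -> low dispersion, crown-glass-like
```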
So far, we have only considered the case of refraction at a plane surface. The situation becomes a little more complicated if the surface is curved. This is very relevant in practice as most standard optical systems can be regarded as a sequence of curved surfaces. For the sake of simplicity, we will initially limit ourselves to monochromatic light, that is, light with one wavelength, so that dispersion can be ignored. In more concrete terms, we will trace light rays which are incident on an embedded spherical medium with a refractive index n′ > n, where n is the refractive index of the surrounding medium. A cross-section of the arrangement is shown in Figure A.47). The rays are emanated in all directions from object point Q on the left. Here, we will concentrate on just one of these rays which forms an angle γ with the optical axis (horizontal symmetry axis). When the ray hits the spherical surface at point A, it is refracted by obeying Snell’s law (A2) and crosses the optical axis again at point Q′. The spherical interface thus creates a point-to-point projection from Q to Q′ at a distance |s| + |s′|.
In the following, we will derive the basic equation which links the distances s and s′ of the point-to-point imaging to the refractive indices n, n′, and the radius of curvature r′ of the spherical surface. From the triangle Q′AC′ in Figure A.4, we see that
γ′ = θ′ − χ′    (A7)
It is useful to express the angle χ′ in terms of χ. For this purpose, we will apply Snell’s law to point A. Although Snell’s law appears to be simple at first sight, the sine terms would make the following calculations quite complicated. For this reason, we will select a ray which travels close to the optical axis. This is the so-called paraxial approximation of small angles which allows substitution of sin γ and tan γ by γ. Analogous relations also hold for χ, χ′, γ′, and θ′. For these paraxial rays, we may use the simplified Snell’s law of refraction given by nχ ≈ n′χ′. The relation (A7) now becomes
γ′ = θ′ − (n/n′)χ    (A8)
In addition, from the triangle QAC′ we deduce
χ = γ + θ′    (A9)
which yields
γ′ = θ′ − (n/n′)(γ + θ′)    (A10)
Next, we retrieve image distance s′, object distance s, and height h from the angles. For this purpose, we use the paraxial approximations θ′ ≈ h/r′, γ ≈ – h/s, and γ′ ≈ h/s′ so that we obtain8)
n′/s′ − n/s = (n′ − n)/r′    (A11)
For a given object distance s, the image distance s′ thus no longer depends on angle γ′ in the case of paraxial approximation. The important consequence of this result is that all paraxial rays emanated from Q intersect at the same point Q′. Point Q′ is referred to as the image point9) (Problem PA.2).
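Equation (A11) can be solved directly for the image distance s′. The sketch below assumes the sign convention used above (distances measured from the vertex, negative to the left, so s < 0 for a real object); the numerical values are a made-up example:

```python
def image_distance_spherical(s, n, n_prime, r_prime):
    """Solve n'/s' - n/s = (n' - n)/r' for the image distance s'."""
    return n_prime / ((n_prime - n) / r_prime + n / s)

# Object point 100 mm to the left of a convex glass surface (r' = +20 mm):
s_prime = image_distance_spherical(s=-100.0, n=1.00, n_prime=1.52, r_prime=20.0)
print(round(s_prime, 1))  # 95.0 -> real image point 95 mm behind the vertex
```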
In the next step, we add a second spherical surface to the system, as shown in Figure A.5. We assume that the centers of both spheres lie on the optical axis so that the whole system is rotationally symmetric. If the sum of radii |r1| + |r2| is greater than the center-to-center distance, the intersecting volume forms a positive, biconvex lens. We further assume that the lens is “thin” so that an incident ray emerges at about the same height at which it enters the lens. In terms of geometric design, we consider the thin lens as a flat refracting sheet S (green dashed line in Figure A.5). The imaging equation of a thin lens, also known as the lens maker’s equation (Problem PA.3), is given by
1/s′ − 1/s = ((n′ − n)/n)(1/r1 − 1/r2)    (A12)
The derivation of Eq. (A12) can be performed either by using the matrix approach to be discussed in Section A.1.3 or by applying Eq. (A11) to both spherical surfaces (Problem PA.4).
The rules of imaging are not only valid for an on-axis object point. We can also apply them to extended objects as long as the paraxial approximation is valid. In this context, it is helpful to regard an extended object as a collection of object points which are distributed around the optical axis (off-axis) and all lie in one common object plane O. We define hO as the object height. For the optical imaging, we select a single point from the object plane and trace all emanated paraxial rays. As before, the rays converge at a respective image point after passage through the lens. If we do this for all the other object points, we will recognize that the whole collection of image points lies in a common image plane I.10) The largest distance of the image points from the optical axis is defined as the image size h′I. On the whole, we replaced the former point-to-point by a plane-to-plane imaging. This means that we may see a sharp image of an extended object by placing a flat screen in the image plane (Figure A.6).
For the optical calculation of extended objects, we select the outermost object point O1 in Figure A.6 and trace the first ray which is parallel to the optical axis (i.e., ray 1). After refraction by the lens at A1, the ray crosses the optical axis at focal point F′. The second ray starts from O1 as well, but passes through the lens center C under oblique incidence without any deflection from the straight traveling path. The position of image plane I is then determined by the intersection of both rays. The geometric design allows the roles of object plane and image plane to be exchanged. This brings us to ray 3 (gray line in Figure A.6). The corresponding object-side focal point is represented by F.
As paraxial image formation by lenses is based on elementary geometric rules, one can easily translate the geometric design into corresponding imaging equations. A closer look at Figure A.6 reveals two pairs of similar triangles. The triangles spanned by the object and image heights together with the central ray are similar, because the angle at lens center C and its opposite angle are equal. In addition, the triangles formed by the parallel ray and focal point F′ are similar11) so that
h′I/hO = s′/s = (f′ − s′)/f′    (A13)
from which we derive the lens equation
1/s′ − 1/s = 1/f′    (A14)
f′ is the focal length, measured between lens center C and focal point F′. In ophthalmology and optometry, f′ is often replaced by its reciprocal counterpart, the refractive power n′/f′, where n′ is the refractive index of the medium on the image side. Note that the refractive power has the dimension of inverse meters and is usually stated in diopters (1 dpt = 1 m⁻¹).
We can also calculate the ratio between object and image size, which is referred to as the magnification
β = h′I/hO = s′/s    (A15)
of the optical system. β is positive if both object and image are upright, and negative if the image is flipped upside down. For example, we see a flipped image in Figure A.6 which is smaller than the original object. In this case, the magnification β is a number between –1 and 0. The image is said to be real, since we could place a screen in image plane I to observe a sharp image of the object.
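The lens equation (A14) and the magnification (A15) can be combined in a few lines of Python. We use the same sign convention as before (object distances negative); the 50 mm focal length is an arbitrary example value:

```python
def thin_lens_image_distance(s, f_prime):
    """Solve the lens equation 1/s' - 1/s = 1/f' for s'."""
    return 1.0 / (1.0 / f_prime + 1.0 / s)

f_prime = 50.0    # focal length in mm
s = -200.0        # object 200 mm in front of the lens
s_prime = thin_lens_image_distance(s, f_prime)
beta = s_prime / s  # magnification, Eq. (A15)

print(round(s_prime, 2))  # 66.67 -> real image behind the lens
print(round(beta, 3))     # -0.333 -> flipped and reduced, as in Figure A.6
```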
For some magnifying visual devices (magnifying loupes, telescopes, and so on), the object is placed between object-side focal point F and lens (Figure A.7). In contrast to Figure A.6, the parallel and focal rays diverge on the image side. Thus, no sharp image is formed there. But when the ray paths are extended in the reverse direction (dashed lines), an upright virtual image is obtained that appears to be located on the object side. In contrast to real images, virtual images can only be detected by using an optical imaging system (e.g., eye or camera).
At the beginning of our discussion about thin lenses (Section A.1.2.1), we assumed that all incident rays emerge from the lens at about the same height at which they entered it. But if the lens thickness L is comparable to its radii of curvature, that is, a “thick” lens, we have to take a closer look at the ray deflection at each interface. In Figure A.8a, we trace an incident ray which is parallel to the optical axis. This ray is refracted at point A by the left lens surface such that it emerges at point B. Refraction at the right lens surface changes the traveling path of the ray again so that it eventually crosses image-side focal point F′. In practice, we usually do not care about the exact ray path inside the lens. Instead, we are interested in the effective change in direction. Therefore, it is sufficient to use a simplified geometric design which gives the same results. For this purpose, the incoming ray path is extended in a forward direction (orange line in Figure A.8a) and the outgoing ray path in a backward direction (green line). The point at which both extended ray paths intersect lies on the image-side principal plane K′. At this plane, the incident parallel rays are “effectively” refracted to the focal point. Accordingly, if an incident light ray crosses the object-side focal point F at an oblique angle, it is “effectively” refracted at object-side principal plane K. Thus, it emerges parallel to the optical axis on the image side. For positive lenses, the positions of the corresponding principal points P and P′ are determined by the object- and image-side focal lengths f = O0F − O0P < 0 and f′ = O0F′ − O0P′, respectively. The imaging equation of a thick lens is given by
1/f′ = ((n′ − n)/n)[1/r1 − 1/r2 + (n′ − n)L/(n′r1r2)]    (A16)
where L is the lens thickness, n′ the refractive index of the lens medium (usually glass), and n the refractive index of the exterior medium. r1 and r2 denote the object-side and image-side lens radii, respectively. For L = 0, Eq. (A16) passes into the lens maker’s equation (A12) (Problem PA.3).
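For a lens in air (n = 1), the thick-lens focal length of Eq. (A16) takes the familiar lensmaker form, sketched below. The radii follow the convention that a biconvex lens has r1 > 0 and r2 < 0; the numerical values are a made-up example:

```python
def thick_lens_focal_length(n_lens, r1, r2, L):
    """Image-side focal length of a thick lens in air (lensmaker's formula):
    1/f' = (n-1)*(1/r1 - 1/r2) + (n-1)**2 * L / (n * r1 * r2)."""
    inv_f = (n_lens - 1.0) * (1.0 / r1 - 1.0 / r2) \
          + (n_lens - 1.0) ** 2 * L / (n_lens * r1 * r2)
    return 1.0 / inv_f

# Symmetric biconvex lens, n' = 1.5, |r| = 50 mm:
print(round(thick_lens_focal_length(1.5, 50.0, -50.0, 10.0), 2))  # 51.72 mm
print(round(thick_lens_focal_length(1.5, 50.0, -50.0, 0.0), 2))   # 50.0 mm (thin-lens limit)
```

As the last line shows, letting L go to zero recovers the thin-lens result of Eq. (A12).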
So far, we have explored positive, biconvex lenses formed by the intersecting volume of two spheres. The added radii of both spheres are greater than the distance between sphere centers (Figure A.9a). For positive lenses, parallel incident light rays converge at focal point F′ (or F in the reverse direction). A similar behavior is also found for plano-convex and meniscus-shaped lenses. Due to their differing geometry and refractive power, the object-side and image-side principal planes K and K′ have to be shifted as shown in Figure A.9b,c. All of these types of positive lenses bend light rays towards the optical axis, and they are said to have a positive refractive power.
In addition to positive lenses, concave or negative lenses (Figure A.9d,e) also exist. They can be designed by virtual connection of two spatially separated spheres (Figure A.10). Negative lenses cause an incident bundle of parallel rays to diverge after passage, and converging rays are made less convergent. Converging rays become parallel rays if their point of convergence coincides with the focal point of the lens. The lens equation (A14) can also be used for negative lenses when we choose a negative focal length f′ < 0. With a single negative lens, only virtual images can be formed.
We will now continue with somewhat more complex optical systems like a set of thin lenses. For this purpose, it is sufficient to restrict ourselves to systems for which all optical components are centered at the optical axis. The imaging rules that we derived for a single lens can thus also be applied to more complex optical systems. For example, let us consider a simple microscope which consists of two positive lenses, as shown in Figure A.11. The lens next to the object forms an image that is either real or virtual. This image now serves as an object for the next lens, which forms another image. In general, we can thus simply repeat the imaging procedure for an arbitrary number of lenses.
Optical devices in ophthalmology normally require more than just one or two lenses. They may also contain curved mirrors or combinations of negative and positive lenses. A “brute-force” calculation of such optical systems may become a difficult and time-consuming undertaking. For this reason, more sophisticated methods are required which can also be used for computer-aided simulations. We already realized that every ray emanated by an extended object is characterized by two values: ray height h and angle γ. Every component of an optical system modifies these quantities. Thus, we will trace a ray which travels along the z axis and is incident on an arbitrary optical component (Figure A.12). If it is a paraxial ray, the sine terms in Snell’s law can be replaced by the angle itself (sin γ ≈ γ). In this case, h, h′, γ, and γ′ are directly related via
h′ = A·h + B·γ    (A17)
γ′ = C·h + D·γ    (A18)
Here, we introduced the parameters A, B, C, and D which describe how the optical component acts on an incident ray.12) In the next step, we combine Eqs. (A17) and (A18) in a matrix operator equation such that
( h′ ; γ′ ) = ( A  B ; C  D ) · ( h ; γ )    (A19)

Here, matrices and column vectors are written row-wise, with “;” separating the rows.
Optical component | ABCD matrix |
Free space of length d (propagation) | ( 1  d ; 0  1 ) |
Slab of thickness L and index n′ in a host medium n (propagation) | ( 1  nL/n′ ; 0  1 ) |
Flat surface, n → n′ (refraction) | ( 1  0 ; 0  n/n′ ) |
Spherical surface of radius r′, n → n′ (refraction) | ( 1  0 ; −(n′ − n)/(n′r′)  n/n′ ) |
Thin lens of focal length f′ (refraction) | ( 1  0 ; −1/f′  1 ) |
Thick lens of thickness L, index n′, radii r1 and r2, in a host medium n (refraction) | ( 1 − (n′ − n)L/(n′r1)  nL/n′ ; −1/f′  1 + (n′ − n)L/(n′r2) ), with 1/f′ from Eq. (A16) |
Thick lens (embedded between two different media) | product of the two spherical-surface matrices and the slab matrix, with the exterior index n replaced by n1 on the object side and n2 on the image side |
Quadratic radial index profile n(ρ) = n0(1 − g²ρ²/2), length z | ( cos(gz)  sin(gz)/g ; −g·sin(gz)  cos(gz) ) |
Planar mirror (reflection) | ( 1  0 ; 0  1 ) |
Spherical mirror of radius r (reflection) | ( 1  0 ; −2/r  1 ) |

Matrices are written row-wise as ( A  B ; C  D ), following the conventions of Eqs. (A17)–(A19).
We join all parameters to one ABCD matrix which contains all relevant information for the imaging process. ABCD matrices of various typical components are listed in Table A.1. If light rays travel backwards through an optical system (inversion of the direction), we simply have to invert the matrix. In such a case, the imaging relation is given by
( h ; γ ) = ( A  B ; C  D )⁻¹ · ( h′ ; γ′ )    (A20)
For instructive purposes, let us derive the matrix expressions for a spherical surface and a thin lens just to understand the concept of matrix optics.
Spherical surface For refraction at a spherical surface of radius r′ (index n on the incident side, n′ on the transmitted side), the ray height at the interface remains unchanged:

h′ = h    (A21)

From the geometry (cf. Figure A.4), the angle of incidence with respect to the surface normal is, in paraxial approximation,

χ ≈ γ + h/r′    (A22)

Applying the paraxial Snell’s law gives the refraction angle

χ′ ≈ (n/n′)χ    (A23)

Converting back to the angle with respect to the optical axis, γ′ = χ′ − h/r′, we obtain

γ′ = −((n′ − n)/(n′r′))·h + (n/n′)·γ    (A24)

Comparison with Eqs. (A17) and (A18) yields the ABCD matrix of the spherical surface

Msphere = ( 1  0 ; −(n′ − n)/(n′r′)  n/n′ )    (A25)

Thin lens A thin lens consists of two spherical surfaces (radii r1 and r2) at negligible distance, so its matrix is the product of two surface matrices:

Mlens = ( 1  0 ; −(n − n′)/(nr2)  n′/n ) · ( 1  0 ; −(n′ − n)/(n′r1)  n/n′ )    (A26)

Multiplying out and identifying the lower left entry with −1/f′ via the lens maker’s equation (A12), this reduces to

Mlens = ( 1  0 ; −1/f′  1 )    (A27)
Arbitrary optical systems To calculate the imaging of an arbitrary optical system, we proceed as for the thin lens example. This means that we decompose the whole system into individual optical elements for which we have a simple matrix expression (some of them given in Table A.1). The resulting arrangement of k elements can be regarded as a chain of matrix operations that must be multiplied like (Problem PA.4)
Mtotal = Mk · Mk−1 ⋯ M2 · M1    (A28)
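The chaining of Eq. (A28) is plain matrix multiplication, so it is easy to sketch in code. The example below images an object through a single thin lens and checks that the B element vanishes in the image plane (the imaging condition), with A then giving the magnification; all distances and the focal length are arbitrary example values:

```python
def matmul(m2, m1):
    """Product of two 2x2 ABCD matrices (m2 is applied after m1)."""
    (a2, b2), (c2, d2) = m2
    (a1, b1), (c1, d1) = m1
    return ((a2 * a1 + b2 * c1, a2 * b1 + b2 * d1),
            (c2 * a1 + d2 * c1, c2 * b1 + d2 * d1))

def free_space(d):          # propagation over a distance d
    return ((1.0, d), (0.0, 1.0))

def thin_lens(f_prime):     # thin lens of focal length f'
    return ((1.0, 0.0), (-1.0 / f_prime, 1.0))

# Object 200 mm in front of a 50 mm lens; the lens equation puts the
# image 200*50/(200 - 50) = 66.67 mm behind the lens.
d_image = 200.0 * 50.0 / (200.0 - 50.0)
M = matmul(free_space(d_image), matmul(thin_lens(50.0), free_space(200.0)))

A, B = M[0]
print(B)  # ~0: all rays from one object point meet again in this plane
print(A)  # -1/3: the magnification, matching the earlier lens-equation example
```

Note the order of the factors: the matrix of the first traversed element stands on the right, as in Eq. (A28).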
Optical systems always have at least one component that limits the solid angle at which rays may pass through. Up to now, we have assumed (without explicitly mentioning it) that the only limiting factor is the diameter of the used lenses or the inner diameter of their mounts. Rays which pass by a lens do not contribute to the imaging process and are thus ignored. Additionally, we can introduce other limiting elements like diaphragms which do not change the path of rays. We mainly distinguish between two types of diaphragm:
- Aperture stops, which limit the bundle of rays that contributes to the image and thus control the image brightness.
- Field stops, which limit the lateral extent of the imaged object, that is, the field of view.
The position in the optical system determines whether a diaphragm acts like an aperture or a field stop.13) To find out which “role” a diaphragm takes up, it is useful to consider rays which are able to pass through the whole optical system. For an on-axis object point (O0), the outermost limiting rays are referred to as marginal rays (violet lines in Figure A.13). Wherever the marginal rays cross the optical axis (ymr = 0), an image is formed. When we place the diaphragm at or near this position, it “cuts off” the outer parts of the corresponding image, and thus acts as a field stop.14) We may also regard the field stop as an “additional overlaid image” which effectively limits the projected field size of the original object.
The diaphragm acts as an aperture stop if placed at or near a pupil plane (ymr ≈ const), as it blocks some incident rays which are emanated by the object. Although the aperture stop regulates the number of rays that can arrive at the image plane, the object is still completely projected to the image side. Hence, we have no cut-off effect; instead, the brightness and resolution (as we will see in Section A.2.1.6) of the whole image are reduced.
The image of the aperture stop formed by the optical components between object plane and aperture stop is referred to as the entrance pupil. Its position and diameter determine the maximum emission angle α of rays from the on-axis object point O0 which contribute to the optical imaging. The extensions of the object-side part of the marginal rays graze the edges of the entrance pupil (Figure A.13).15) A measure for the range of ray angles, which can enter the optical system, is the numerical aperture
NA = n sin α    (A29)
where n is the refractive index of the medium in the object space (yellow subspace in Figure A.13) into which the system is embedded.
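Equation (A29) also shows why immersion media increase the numerical aperture: for the same acceptance angle, a higher object-space index n scales NA linearly. A two-line check (the 14.5° angle is an arbitrary example):

```python
import math

def numerical_aperture(n, alpha_deg):
    """NA = n * sin(alpha), with alpha the maximum object-side ray angle."""
    return n * math.sin(math.radians(alpha_deg))

print(round(numerical_aperture(1.00, 14.5), 3))  # 0.25  (system in air)
print(round(numerical_aperture(1.33, 14.5), 3))  # 0.333 (same angle in water)
```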
The image of the aperture stop formed by the optical components between aperture stop and image plane is referred to as the exit pupil. It determines the maximum convergence angle α′ of rays in the image space (blue subspace in Figure A.13). The extension of the image-side part of the marginal ray, which converges at the on-axis image point I′0, grazes the edges of the exit pupil. Summing up, the marginal rays define the diameters of entrance and exit pupils as well as the position of the images.
To determine the image size and the positions of entrance and exit pupils, we use the so-called chief ray (dashed red line in Figure A.13). The chief ray emerges from the off-axis object point O1 and passes through the center of the aperture stop. The pupil centers are located at the intersections of the object- and image-side extensions of the chief ray with the optical axis, that is, points E and E′. The image size h′I of an object is given by the chief ray height ycr in the image plane.
If the optical system only consists of a thin lens (e.g., as shown in Figure A.6), the lens forms the aperture stop, and both entrance and exit pupils coincide with this aperture stop. In more complex optical systems (as shown in Figure A.13), however, the pupils often differ from the aperture stop and need not necessarily be physical diaphragms.
The aperture stop determines shape and size of the ray bundle emanated from on-axis object point O0. However, when the bundle of rays is emanated from an off-axis object point, a part of it may be blocked by the lens mount or simply passes by the lenses and thus does not contribute to the imaging (Figure A.14). This “shading effect” is called vignetting and becomes more severe the larger the distance between object point and optical axis. Vignetting thus decreases the brightness toward the outer zone of an image. To reduce this effect in cascaded optical systems, the exit pupil of the first sub-system E′1 has to match the entrance pupil of the second (downstream) sub-system E2 as exactly as possible. If E′1 and E2 are not located at the same position on the optical axis and do not have the same diameter, some incident light rays cannot enter the second sub-system and the resulting image has a nonuniform brightness.
Vignetting can be removed when no lens rims or mounts act as the limiting elements for any object point. Alternatively, we can also cut off the shadowed parts of the image with a field aperture. In the latter approach, we obviously crop the visible image, but the remaining part has a homogeneous brightness.
For any optical system, invariants exist which are constant throughout the entire optical path. The Helmholtz–Lagrange invariant [2] is given by
H = n(α·ycr − φ·ymr)    (A30)
n denotes the refractive index, α the marginal ray angle, and ymr the marginal ray height. φ is the chief ray angle and ycr the chief ray height at a selected position z along the optical axis (Figure A.13). To calculate the invariant, we can select any arbitrary plane along the optical axis which is not located inside the optical system (subspace in Figure A.13 framed by the gray dashed lines). In particular, we may select as one plane the object plane O and as another one the image plane I (dotted black lines in Figure A.13). When we take the possibly different refractive indices of the object- and image-side media into account (n ≠ n′), note that the height of the marginal ray ymr is zero in both the object and image plane, and use ycr = hO, it follows from Eq. (A30) that
n·α·hO = n′·α′·h′I    (A31)
The primed and unprimed parameters refer to the image and object space (Figure A.13), respectively.
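Equation (A31) couples magnification and ray convergence: making the image-side angle smaller necessarily makes the image larger. A minimal sketch with made-up paraxial values (angles in radians):

```python
def image_height(n, alpha, h_obj, n_prime, alpha_prime):
    """Image height from the invariant n*alpha*h_O = n'*alpha'*h_I'."""
    return n * alpha * h_obj / (n_prime * alpha_prime)

# Same medium on both sides; halving the convergence angle doubles the image:
print(image_height(1.0, 0.10, 2.0, 1.0, 0.05))   # 4.0
print(image_height(1.0, 0.10, 2.0, 1.0, 0.10))   # 2.0 (unit magnification)
```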
From Eq. (A31), we arrive at another invariant which is referred to as the throughput or étendue G. The étendue characterizes the amount of light passing through an optical system. Geometrically, G is determined by the area of the entrance pupil times the solid angle subtended by the light source as seen from the pupil (see also [2, 3]). For any arbitrary optical system which fulfills Fermat’s principle (Section A.1.1), the étendue is constant in every pupil plane of the considered optical system16) and proportional to the square of the Helmholtz–Lagrange invariant (Problem PA.5). So, we have
G ∝ (n·α·hO)²    (A32)
One of the first assumptions in this chapter was the paraxial approximation for small refraction angles: sin γ ≈ tan γ ≈ γ. Here, we assumed that rays emanating from an object point travel close to the optical axis. When considering the imaging behavior of a positive lens, we realized that all paraxial, monochromatic rays converge at one single image point (Figure A.15a). In this case, a sharp image of the object point was obtained. But what happens with rays for which the paraxial approximation is not valid anymore? So far, we have simply ignored the nonparaxial rays. But in reality, they are usually present as well. For example, Figure A.15b shows how nonparaxial rays from an on-axis object point converge at different points after passage through the lens. Rays at different angles thus form different image points along the optical axis. Figure A.15c shows that rays from an off-axis object point form different image points in the image plane. As we do not have a defined point-to-point projection in both cases, the image becomes blurred. Any deviation from the ideal paraxial case leads to so-called aberrations. All these facts also hold for every point of an extended object.
If aberrations are present in an optical system, the image of an object point looks “spread out”. To quantify this behavior, we may use the point-spread function PSF, which is the image of an object point formed by a given optical system. A full understanding of the PSF is, however, only possible in the context of wave optics, as diffraction effects have to be considered for image formation (Section A.2.1.6). For now, we may think of this in a simplified manner in that we regard the intensity distribution of a point image as the density of rays homogeneously emanated from the object point into any solid angle passing through the image plane. Each ray corresponds to the same amount of energy. This concept is often called geometric spot. When diffraction effects can be neglected, the geometric spot is similar to the PSF. We can use it at this stage to get an understanding of the concept of aberrations.17)
In practice, we often want to image a given object as accurately as possible. Correction of ray aberrations is thus an extremely important topic for the development of optical systems and devices. Since each type of aberration can have a positive or negative sign, it is in principle possible to cancel out all kinds of deviation from the ideal image with a suitable arrangement of optical components. Nevertheless, a lot of effort and experience is needed to optimize the image quality of optical systems to a sufficient degree.
We distinguish between longitudinal and transverse ray aberrations:
- Longitudinal aberrations describe deviations of the ray intersection points along the optical axis (e.g., nonparaxial rays focusing at different axial positions).
- Transverse aberrations describe deviations of the ray intersection points within the image plane (lateral blur of the image point).
Before we go through the maths of aberration theory, it is instructive to get a quick overview of the imaging consequences of the most relevant types of aberration.
We consider a simple optical system which consists of an on-axis object point and a positive (biconvex) lens, as depicted in Figure A.16a. For now, we do not restrict ourselves to paraxial rays. We also trace rays which pass through the outer zone of the lens. When Snell’s law is applied to all rays without any approximation, rays passing through the outer zone of the lens intersect with the optical axis further to the left than the paraxial rays. As refraction depends on the angle of incidence γ, no unique image point exists. The concentric blurring is called spherical aberration.
In Figure A.16b, corresponding images of the point object (i.e., the point-spread functions) are depicted for the object plane and the plane of axis intersection of paraxial rays. Since all rays are centered at the optical axis, the PSF is rotationally symmetric.
From this, we conclude that spherical aberration is particularly pronounced when the outer zones of lenses with short focal lengths are illuminated, as is the case for microscope objectives and focusing systems. For the correction of spherical aberration, among other options, several positive and negative lenses are combined in a suitable manner, or aspheric lenses are used.
Let us look at another imaging aberration called coma, which appears only for off-axis object points or if components of an optical system are misaligned. The ray diagram in Figure A.17a shows how obliquely incident rays are refracted by a positive lens. Rays from an off-axis point passing through a particular zone of the lens intersect the image plane in a circle. The size and lateral shift of these circles increase with the diameter of the zone. The stacked circles thus produce the characteristic coma tail (see picture of “image plane” in Figure A.17b).18)
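The zone-circle structure of coma follows a well-known third-order (Seidel) pattern: rays through an annular pupil zone of radius rho at azimuth theta land on a circle in the image plane that is traced twice per revolution of theta. The Python sketch below uses this standard pattern with an assumed coma coefficient B:

```python
import math

# Third-order (Seidel) coma pattern for rays through an annular pupil zone.
B = 0.01   # coma coefficient (assumed, arbitrary units)

def coma_image_point(rho, theta):
    """Transverse image offset (relative to the chief ray) of the ray that
    passes the pupil zone of radius rho at azimuth theta."""
    dx = B * rho**2 * math.sin(2 * theta)
    dy = B * rho**2 * (2 + math.cos(2 * theta))
    return dx, dy

for rho in (1.0, 2.0, 3.0):
    pts = [coma_image_point(rho, k * math.pi / 18) for k in range(36)]
    cy = sum(p[1] for p in pts) / len(pts)               # circle center height
    r = max(math.hypot(px, py - cy) for px, py in pts)   # circle radius
    print(f"zone rho = {rho}: circle radius {r:.4f}, center shift {cy:.4f}")
# Radius and center shift both grow as rho**2, and the shift is twice the
# radius, so the circles stack into the characteristic 60-degree coma tail.
```

Because the center shift is always twice the circle radius, the envelope of all zone circles forms the familiar 60-degree wedge of the coma flare.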
In complex optical systems, off-axis coma can be compensated by a suitable optical design. Optical systems are called aplanatic if they are free of spherical aberration and corrected for coma (at least for near-axis points).
If two rays emanating from an off-axis object point pass through a lens at different angles, they will “see” different effective radii of the surface curvature. This already occurs in thin ray bundles. As a consequence, we find two different focal positions for the sagittal (xz) planes and meridional/tangential (yz) planes (Figure A.18) which, in turn, leads to a blurred image. Between the sagittal and meridional image planes, we have a disk of least confusion at which the deviations from the ideal image in vertical and horizontal directions are best compensated for. However, the resolution is worse than in the case of ideal imaging, since all fine object details are softened.
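The position of the disk of least confusion can be estimated with a simple similar-triangles blur model: beyond the first (tangential) line focus the tangential blur grows in proportion to (z - z_t)/z_t, while the sagittal blur still shrinks in proportion to (z_s - z)/z_s; equating the two gives the harmonic mean of the focus distances. The example distances below are assumed for illustration:

```python
# Astigmatic line foci, measured from the lens (assumed example values).
z_t = 95.0    # tangential (meridional) line focus in mm
z_s = 105.0   # sagittal line focus in mm

# Equal tangential and sagittal blur -> harmonic mean of the two distances
# (the "dioptric midpoint": 1/z_dlc is the average of 1/z_t and 1/z_s).
z_dlc = 2 * z_t * z_s / (z_t + z_s)
print(f"disk of least confusion at {z_dlc:.2f} mm")  # 99.75 mm here
```

Note that the result (99.75 mm) lies slightly closer to the tangential focus than the arithmetic midpoint (100 mm), because it is the focus distances’ reciprocals, i.e., the vergences, that are averaged.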
In rotationally symmetric systems, astigmatism can only occur for off-axis object points, as it is an effect of broken symmetry. In contrast, systems with asymmetric or off-axis optical components can have astigmatic errors even for on-axis object points. Naturally, aspheric, toric, or any other free-form surfaces can exhibit an even more pronounced astigmatism.
As we can see in Figure A.19, a positive lens projects extended objects onto a curved surface. With a flat screen, we could therefore scan between the image planes of axial and off-axis object points to see either sharp outer zones or a sharp center. An example is shown in the insets on the right.
Departures from the ideal image due to field curvature are quite difficult to eliminate and are even more severe at low magnifications. To correct this aberration, we need combinations of lenses which add positive and negative refractive powers so as to flatten the image field. Following this design principle, photographic systems use positive lenses with larger diameters and high refractive power but low refractive index, together with negative lenses of smaller diameter. For stereo microscopes, field curvature is also an important issue (Section 6.2). The simplest solution would be to capture the image on a curved screen. Unfortunately, this is often not feasible for technical systems.
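The design rule of combining low-index positive and high-index negative lenses can be motivated by the Petzval sum: for thin lenses in air, the curvature of the image surface is proportional to the sum of 1/(n·f) over all lenses, and a flat field requires this sum to vanish. A small Python sketch with assumed example lenses:

```python
# Petzval field curvature for thin lenses in air: the image-surface
# curvature is proportional to sum(1/(n_i * f_i)); zero sum = flat field.
# The lens data below are assumed for illustration only.
lenses = [
    (1.5, 100.0),   # (refractive index, focal length in mm): low-index positive lens
    (1.8, -300.0),  # high-index negative lens
]

petzval_sum = sum(1.0 / (n * f) for n, f in lenses)
total_power = sum(1.0 / f for _, f in lenses)   # thin lenses in contact

print(f"Petzval sum:  {petzval_sum:+.6f} 1/mm")
print(f"total power:  {total_power:+.6f} 1/mm")
# The high-index negative lens reduces the Petzval sum of the positive lens
# while the combination keeps a net positive refractive power -- the design
# rule quoted above for photographic systems.
```

Because the negative lens contributes 1/(n·f) with a large n, it cancels relatively much field curvature per unit of (unwanted) negative power, which is exactly why high-index negative elements are preferred for field flattening.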
Distortion is the only type of aberration which is not related to image blurring. Instead, it leads to a change of magnification which increases with the distance of the image point from the optical axis. To understand the origin of distortion, we consider an extended object which is imaged by a thin lens (Figure A.20). In addition, we place a circular aperture stop in front of the lens. The position of each image point is determined by the chief ray that passes through the center of the aperture stop. When the aperture stop touches the lens, the chief ray passes through the lens without any deflection, and the image shows no distortion at all (orthoscopic system).

We now place the aperture stop between lens and object plane, as depicted in Figure A.20a. The chief ray again crosses the center of the aperture stop. But this time, it passes through the outer zone of the lens. The image size (i.e., the magnification) thus decreases in a “nonlinear” manner with distance from the optical axis in that the image corners shrink more than the central part. In this effect, called barrel distortion, straight lines of an object appear bent in the image. When the aperture stop is placed behind the lens on the image side, we see the opposite effect (Figure A.20b). The image size increases with distance from the optical axis, that is, the magnification is larger toward the image corners. This is called pincushion distortion. In contrast to its position, the diameter of the aperture stop does not influence distortion, since the chief ray does not change its path when we make the aperture smaller or larger.
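The field-height dependence of the magnification is often summarized by a simple cubic radial model, r_image = m0·r_object·(1 + k·r_object²), where k < 0 gives barrel and k > 0 pincushion distortion. This model and its coefficients are assumed here purely for illustration; it is not the chief-ray construction of Figure A.20:

```python
# Cubic radial distortion model (assumed illustration, arbitrary numbers):
# r_image = m0 * r_object * (1 + k * r_object**2)
m0 = 0.5   # paraxial magnification (assumed)

def local_magnification(r_obj, k):
    """Magnification experienced by an object point at field radius r_obj."""
    return m0 * (1 + k * r_obj**2)

for label, k in (("barrel", -2e-4), ("orthoscopic", 0.0), ("pincushion", +2e-4)):
    mags = [local_magnification(r, k) for r in (0.0, 20.0, 40.0)]
    print(label, ", ".join(f"{m:.4f}" for m in mags))
# Barrel: the magnification falls toward the field corners;
# pincushion: it rises; k = 0 reproduces the distortion-free case.
```

The model reflects the key property discussed above: only the sign and size of k (set by the stop position in a real system) matter, while a uniform scaling of the aperture leaves the chief rays, and hence k, unchanged.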