This article explains:
- What paraxial rays are
- What parabasal rays are
- Differences between them that can cause confusion
The article is accompanied by a ZIP archive containing the samples used and a macro. This can be downloaded from the final page of the article.
Authored By: Mark Nicholson
What are paraxial rays?
Paraxial optics is ray-tracing performed in the limit of very small ray angles and heights. It allows us to make a number of simplifying assumptions that make the arithmetic of ray-tracing considerably easier.
Assumption 1 concerns Snell's Law itself. When refracting from one material into another, the celebrated equation is
$$n \sin\theta = n' \sin\theta'$$
where unprimed quantities are before refraction and primed are after. For small angles $\sin\theta \approx \theta$, and so Snell's Law can be written
$$n\theta = n'\theta'$$
This of course was an enormous relief in the days before modern calculators and computers.
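As a rough illustration of how quickly the linear form departs from the exact law, the following Python sketch compares the two refraction angles for a ray passing from air into glass. The indices (1.0 and 1.5), the list of angles, and the function names are assumptions chosen for the example; none of this comes from the article's sample files.

```python
import math

def refract_exact(theta, n1, n2):
    """Refraction angle (radians) from the exact law n1*sin(theta) = n2*sin(theta')."""
    return math.asin(n1 * math.sin(theta) / n2)

def refract_paraxial(theta, n1, n2):
    """Refraction angle (radians) from the linearized law n1*theta = n2*theta'."""
    return n1 * theta / n2

if __name__ == "__main__":
    n_air, n_glass = 1.0, 1.5  # assumed indices, for illustration only
    for deg in (1, 5, 10, 20, 40):
        theta = math.radians(deg)
        exact = math.degrees(refract_exact(theta, n_air, n_glass))
        parax = math.degrees(refract_paraxial(theta, n_air, n_glass))
        print(f"{deg:3d} deg in: exact {exact:8.4f} deg, "
              f"paraxial {parax:8.4f} deg, difference {parax - exact:+.4f} deg")
```

At an incidence angle of a degree or so the two results agree closely; by a few tens of degrees the difference is no longer negligible.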
Many definitions in optics are based upon this assumption of linearity, which leads to the term first-order optics. Aberrations are third-order and higher deviations from this linearity, because as $\theta$ gets larger, $\sin\theta = \theta - \theta^3/3! + \theta^5/5! - \dots$ and the higher-order terms can no longer be neglected. The paraxial properties of optical systems are often considered the properties the system has in the absence of aberrations.
Assumption 2 is that, as the ray height on the surface is small, we can ignore the curvature of surfaces and instead trace rays between flat surfaces of equivalent power. The power of a surface of curvature $C$ between two indices $n$ and $n'$ is:
$$\phi = (n' - n)\,C$$
and by ignoring the curvature for ray-intercept purposes we are saved the task of computing the exact ray-surface intercept point.
Assumption 3 is that the tangent of the ray angle (the ray slope) may be replaced by the ray angle. This assumption may not be obvious, but it is fundamental. Consider a paraxial ray being traced between two flat surfaces separated by a distance $t$, as shown below. The ray has an initial height $y$ on the first surface and has y- and z-direction cosines $\{m, n\}$. Its height $y'$ on the next surface is given by:
$$y' = y + t\tan\theta = y + \frac{m}{n}\,t \approx y + \theta\,t$$
because not only does $\sin\theta \approx \theta$ but $\tan\theta \approx \theta$ also. This has a fundamental consequence which is sometimes missed: the slope of a paraxial ray is the same as its angle.
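The three assumptions together reduce ray-tracing to two linear steps: refraction at a flat surface of equivalent power, and transfer to the next surface. The Python sketch below is a minimal illustration of that idea, not the routine Zemax uses. It assumes the common textbook form of the paraxial refraction equation, $n'u' = nu - y\phi$, with the angle $u$ taken as positive for a rising ray; the surface data (curvature 0.02 per mm, indices 1.0 and 1.5) and the function names are hypothetical.

```python
def surface_power(n1, n2, curvature):
    """Power of a surface of the given curvature between indices n1 and n2."""
    return (n2 - n1) * curvature

def paraxial_refract(y, u, n1, n2, curvature):
    """Paraxial angle after refraction, from n2*u' = n1*u - y*power (assumed convention)."""
    return (n1 * u - y * surface_power(n1, n2, curvature)) / n2

def paraxial_transfer(y, u, t):
    """Ray height after travelling a distance t; the angle u doubles as the slope."""
    return y + u * t

if __name__ == "__main__":
    n1, n2, curvature = 1.0, 1.5, 0.02   # assumed single surface, radius 50 mm
    y0, u0 = 1.0, 0.0                    # ray parallel to the axis at height 1 mm
    u1 = paraxial_refract(y0, u0, n1, n2, curvature)
    focus = -y0 / u1                     # distance at which the ray reaches the axis
    print("angle after refraction:", u1)
    print("distance to paraxial focus:", focus)
    print("height at focus:", paraxial_transfer(y0, u1, focus))
```

For these assumed values the ray crosses the axis about 150 mm behind the surface, which is $n'/\phi$, and no surface intercept or trigonometric function is ever evaluated.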
Clearly paraxial optics introduces major simplifications to the calculation of ray-tracing, but it would be a mistake to consider paraxial optics as just a computational device, of no consequence now that we have calculators and computers. Paraxial optics represents the limiting properties of rotationally symmetric systems composed of spherical surfaces. However, parabasal rays are more general and more useful.
What are parabasal rays?
Parabasal rays are real rays that make a small angle to the chief ray. They are "real" in the sense that the full form of Snell's Law is used, so that the ray interacts with the real surface curvature and not a plane of equivalent power, and that no approximations are made in the ray-tracing.
Parabasal rays therefore lose all the computational advantages of paraxial rays, but better represent the limiting performance of a system as the aperture goes to zero. In particular, they allow surfaces to be tilted or decentered, to be non-rotationally symmetric, to be diffractive, to be gradient index, etc.
Many calculations require a paraxial reference against which the real rays are compared. To ensure that these features work properly, even in systems not well described by first-order optics, Zemax uses parabasal rays to compute the limiting properties of the system as the aperture goes to zero.
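The idea that real rays traced very close to the reference ray recover the limiting properties can be sketched numerically. The Python example below traces a ray parallel to the axis through a single spherical refracting surface using the exact form of Snell's Law and the true surface intercept, then shrinks the ray height. The surface (radius 50 mm, indices 1.0 and 1.5) and the function names are assumptions for illustration, and the optical axis stands in for the chief ray; this is not the code Zemax uses.

```python
import math

def real_axis_crossing(h, radius, n1, n2):
    """Exact axis-crossing distance, measured from the surface vertex, of a ray
    arriving parallel to the axis at height h on a sphere of the given radius."""
    sin_i = h / radius                        # incidence angle against the true normal
    i = math.asin(sin_i)
    i_prime = math.asin(n1 * sin_i / n2)      # full Snell's Law, no approximation
    sag = radius - math.sqrt(radius * radius - h * h)   # real intercept position
    return sag + h / math.tan(i - i_prime)

def paraxial_focus(radius, n1, n2):
    """Paraxial focus distance n2/power, with power = (n2 - n1)/radius."""
    return n2 * radius / (n2 - n1)

if __name__ == "__main__":
    radius, n1, n2 = 50.0, 1.0, 1.5           # assumed surface data
    print("paraxial focus:", paraxial_focus(radius, n1, n2))
    for h in (10.0, 1.0, 0.1, 0.001):
        print(f"h = {h:7.3f} mm  real ray crosses axis at "
              f"{real_axis_crossing(h, radius, n1, n2):.6f} mm")
```

As the ray height shrinks, the exact crossing point approaches the paraxial focus. For this rotationally symmetric surface the two limits agree, and the same real-ray procedure carries over to surfaces for which no flat-surface equivalent power exists.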
However, applying paraxial theory to real lenses often causes confusion. To demonstrate this, we are going to trace paraxial rays in a real optical system. We will also trace a parabasal ray, so you can see precisely how the parabasal calculation works, and why it is generally superior.
The file real system.zmx is supplied in the ZIP file available from the last page of this article. It shows a microscope objective working at large NA. It is highly optimized and diffraction limited. The curvature of the final surface (drawn in red) is controlled by a marginal ray angle solve that forces the system to have a "marginal ray angle" of -0.5.
Image space numerical aperture (ISNA) is defined as "the index of image space times the sine of the angle between the paraxial on-axis chief ray and the paraxial on-axis +y marginal ray calculated at the defined conjugates for the primary wavelength." [1] Now the image space index is 1, so it's tempting to think that the ISNA should be sin(0.5) = 0.479. Zemax, however, computes it as 0.447. Why?
How are first-order optical properties calculated?
Remember assumption 3: $\tan\theta \approx \theta$, so the paraxial ray angle is replaced by the ray slope. Now consider this macro:

This gives the result:

As the marginal ray angle set by the solve goes to zero, the paraxial marginal ray angle and its tangent tend to each other. At this large marginal ray angle, however, they are quite different.
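The two candidate values of ISNA quoted earlier can be checked with a few lines of arithmetic. The Python sketch below is independent of Zemax and uses hypothetical function names. It treats the magnitude of the solve value, 0.5, first as an angle in radians and then as a slope (the tangent of the angle), with an image space index of 1.

```python
import math

def na_from_angle(index, angle):
    """Numerical aperture if the value is taken to be the ray angle in radians."""
    return index * math.sin(angle)

def na_from_slope(index, slope):
    """Numerical aperture if the value is taken to be the ray slope, tan(angle)."""
    return index * math.sin(math.atan(slope))

if __name__ == "__main__":
    value = 0.5  # magnitude of the marginal ray angle solve
    print(f"treated as an angle: {na_from_angle(1.0, value):.3f}")   # 0.479
    print(f"treated as a slope:  {na_from_slope(1.0, value):.3f}")   # 0.447
    # The two readings converge as the value shrinks.
    for v in (0.5, 0.1, 0.01):
        print(f"value {v:5.2f}: angle reading - slope reading = "
              f"{na_from_angle(1.0, v) - na_from_slope(1.0, v):.2e}")
```

The slope reading reproduces the 0.447 that Zemax reports, which is consistent with the solve fixing the ray slope rather than the ray angle.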
To trace parabasal rays, trace a real ray very close to the chief ray and scale it to the desired pupil coordinate:

This gives

It can be seen that it is the parabasal marginal ray angle, or tangent of the paraxial marginal ray angle, that is set by the solve.
Summary and References
When using first-order optical parameters, take great care with definitions. Some, like ISNA, are linked to a purely paraxial definition, and can be misleading if the optical system is not well described by paraxial optics. The underlying assumptions of paraxial optics are that the ray makes a small angle with the chief ray and has a small height relative to it, so that
- Snell's Law can be replaced with its linear approximation
- The surface shape can be ignored, and a flat surface of equivalent power is used instead
- The ray slope is equivalent to the ray angle
These approximations are all made to make the numerical computation easier, but at the cost of generality. Parabasal rays are real rays that satisfy the paraxial condition, i.e. that they make a small angle and have small height with respect to the chief ray, but are otherwise traced as normal.
In summary, paraxial ray data is computed using first-order approximations to the surface power for tracing rays, while parabasal rays are real, exact ray traces close to a chief or reference ray. Most paraxial data, such as EFL, F/#, and magnification, use paraxial rays, and the data is invalid if the optical system is not well described by the vertex power of every surface.
Most analysis features in Zemax use parabasal rays, to allow these features to work with a greater range of optical systems, including those with optical surfaces not well described solely by their vertex surface power.
References
1. Zemax Users Guide, Chapter 3, "Conventions and Definitions"
Further Reading
Introduction to Lens Design, With Practical Zemax Examples, GEARY, Joseph M, Willmann-Bell Inc
Practical Computer-Aided Lens Design, SMITH, Gregory Hallock, Willmann-Bell Inc