Thursday, December 12, 2013

The SkyBox camera

Christmas (and Christmas shopping) is upon us, and I have a big review coming up, but I just can't help myself...

SkySat-1, from a local startup SkyBox Imaging, was launched on November 21 on a Russian Dnepr rocket, along with 31 other microsatellites and a package bolted to the 3rd stage.  They have a signal, the satellite is alive, and it has seen first light.  Yeehah!

These folks are using area-array sensors.  That's a radical choice, and I'd like to explain why.  For context, I'll start with a rough introduction to the usual way of making imaging satellites.

A traditional visible-band satellite, like the DubaiSat-2 that was launched along with SkySat-1, uses a pushbroom sensor, like this one from DALSA.  It has an array of 16,000 (swath) by 500 (track) pixels.
The "track" pixel direction is divided into multiple regions, which each handle one color, arranged like this:
Digital pixels are little photodiodes, each with an attached capacitor that stores the charge accumulated during the exposure.  A CCD is a special kind of circuit that can shift a charge from one pixel's capacitor to the next.  CCDs are read by shifting the contents of the entire array along the track direction, which in this second diagram would be to the right.  As each line is shifted into the readout line, it is very quickly shifted along the swath direction.  At multiple points along the swath there are "taps" where the stored charge is converted into a digital number which represents the brightness of the light on that pixel.

A pushbroom CCD is special in that it has a readout line for each color region.  And, a pushbroom CCD is used in a special way.  Rather than expose a steady image on the entire CCD for tens of milliseconds, a moving image is swept across the sensor in the track direction, and in synchrony the pixels are shifted in the same direction.
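To make the "shift in synchrony" idea concrete, here is a toy Python simulation of a single pixel column.  It is only a sketch: the function name `tdi_scan`, the one-sample-per-step image motion, and the noiseless charge are my assumptions, not anything from a real sensor datasheet.  Each charge packet follows the same scene sample down the column, so it comes out having been exposed once per row.

```python
def tdi_scan(scene, rows):
    """Sweep a 1-D strip of scene brightness across `rows` sensor rows.

    Hypothetical toy model: the image moves exactly one row per step and
    the charge packets are shifted one row per step in the same direction.
    """
    wells = [0.0] * rows      # charge packet in each row; row 0 is where the image enters
    output = []
    for step in range(len(scene) + rows - 1):
        # Expose: at this step, row r is looking at scene sample (step - r).
        for r in range(rows):
            i = step - r
            if 0 <= i < len(scene):
                wells[r] += scene[i]
        # Shift: the last row goes to the readout line, an empty packet enters row 0.
        packet = wells.pop()
        wells.insert(0, 0.0)
        # The first (rows - 1) packets never saw the scene, so skip them.
        if step >= rows - 1:
            output.append(packet)
    return output


strip = [1.0, 5.0, 2.0]
print(tdi_scan(strip, rows=4))   # each sample is accumulated once per row
```

With 4 rows every output value is 4 times the scene value, which is the whole point: the exposure grows with the number of rows even though the image never stops moving.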

A pushbroom CCD can sweep out a much larger image than the size of the CCD.  Most photocopiers work this way.  The sensor is often the full width of the page, perhaps 9 inches wide, but just a fraction of an inch long.  To make an 8.5 x 11 inch image, either the page is scanned across the sensor (page feed), or the sensor is scanned across the page (flatbed).

In a satellite like DubaiSat-2, a telescope forms an image of some small portion of the earth on the CCD, and the satellite is flown so that the image sweeps across the CCD in the track direction.
Let's put some numbers on this thing.  If the CCD has 3.5 micron pixels like the DALSA sensor pictured, and the satellite is in an orbit 600 km up, and has a telescope with a focal length of 3 meters, then the pixels, projected back through that telescope to the ground, would be 70 cm on a side.  We call 70 cm the ground sample distance (GSD).  The telescope might have an aperture of 50 cm, which is as big as the U.S. Defense Department will allow (although who knows if they can veto a design from Dubai launched on a Russian rocket).  If so, it has a relative aperture of f/6, which will resolve 3.5 micron pixels well with visible light, if diffraction limited.
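The GSD and f-number above are just similar triangles and a ratio.  A minimal sketch, using the numbers from the paragraph (the function names are mine, and it assumes the telescope looks straight down):

```python
def ground_sample_distance(pixel_pitch_m, altitude_m, focal_length_m):
    """Size of one pixel projected onto the ground at nadir (similar triangles)."""
    return pixel_pitch_m * altitude_m / focal_length_m


def f_number(focal_length_m, aperture_m):
    """Relative aperture: focal length divided by aperture diameter."""
    return focal_length_m / aperture_m


gsd = ground_sample_distance(pixel_pitch_m=3.5e-6, altitude_m=600e3, focal_length_m=3.0)
print(f"GSD: {gsd * 100:.0f} cm")
print(f"relative aperture: f/{f_number(3.0, 0.5):.0f}")
```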

The satellite is travelling at 7561 m/s in a north-south direction, but its ground projection is moving under it at 6911 m/s, because the ground projection is closer to the center of the earth.  The Earth is also rotating underneath it at 400 m/s at 30 degrees north of the equator.  The combined relative velocity is 6922 m/s.  At 70 cm per pixel, that's 9,900 pixels per second in the track direction.  9,900 lines/second x 16,000 pixel swath = 160 megapixels/second.  The signal chain from the taps in the CCD probably will not run well at this speed, so the sensor will need at least 4 taps per color region to get the analog to digital converters running at a more reasonable 40 MHz.  This is not a big problem.
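Here is that chain of arithmetic as a short Python sketch.  The mean Earth radius of 6371 km is my assumption, as is treating the north-south ground track and the east-west rotation as perpendicular so they add in quadrature; the results land within a meter or two per second of the figures quoted above.

```python
import math

EARTH_RADIUS_M = 6371e3     # assumed mean Earth radius
ALTITUDE_M = 600e3
ORBITAL_SPEED = 7561.0      # m/s, from the text
EARTH_ROTATION = 400.0      # m/s at 30 degrees latitude, from the text
GSD_M = 0.70
SWATH_PIXELS = 16_000
ADC_RATE_HZ = 40e6          # comfortable per-tap conversion rate, from the text

# The sub-satellite point sweeps a smaller circle than the orbit itself.
ground_track_speed = ORBITAL_SPEED * EARTH_RADIUS_M / (EARTH_RADIUS_M + ALTITUDE_M)

# North-south track plus east-west rotation, assumed perpendicular.
relative_speed = math.hypot(ground_track_speed, EARTH_ROTATION)

line_rate = relative_speed / GSD_M          # lines per second in the track direction
pixel_rate = line_rate * SWATH_PIXELS       # pixels per second, per color region
taps = math.ceil(pixel_rate / ADC_RATE_HZ)  # taps needed to stay at or under 40 MHz

print(f"ground track speed: {ground_track_speed:.0f} m/s")
print(f"relative speed:     {relative_speed:.0f} m/s")
print(f"line rate:          {line_rate:.0f} lines/s")
print(f"pixel rate:         {pixel_rate / 1e6:.0f} Mpixel/s")
print(f"taps per color:     {taps}")
```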

A bigger problem is getting enough light.  If the CCD has 128 rows of pixels for one color, then the time for the image to slide across the column will be 13 milliseconds, and that's the effective exposure time.  If you are taking pictures of your kids outdoors in the sun, with a point-and-shoot with 3.5 micron pixels, 13 ms with an f/6 aperture is plenty of light.  Under a tree that'll still work.  From space, the blue sky (it's nearly the same blue looking both up and down) will be superposed on top of whatever picture we take, and images from shaded areas will get washed out.  More on this later.
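The effective exposure is just the number of rows divided by the line rate.  A tiny sketch, using the rounded 9,900 lines/second from above (the function name is mine):

```python
def tdi_exposure_s(rows, line_rate_hz):
    """Time for the image to cross `rows` rows, i.e. the effective exposure."""
    return rows / line_rate_hz


exposure = tdi_exposure_s(rows=128, line_rate_hz=9900.0)
print(f"effective exposure: {exposure * 1e3:.1f} ms")
```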

Okay, back to SkySat-1.  The Skybox Imaging folks would like to shoot video of things, as well as imagery, and don't want to be dependent on a custom sensor.  So they are using standard area array sensors rather than pushbroom CCDs.

In order to shoot video of a spot on the ground, they have to rotate the satellite at almost 1 degree/second so that the telescope stays pointing at that one point on the ground.  If it flies directly over that spot, it will take about 90 seconds to go from 30 degrees off nadir in one direction to 30 degrees off in the other direction.  In theory, the satellite could shoot imagery this way as well, and that's fine for taking pictures of, ahem, targets.

A good chunk of the satellite imagery business, however, is about very large things, like crops in California's Central Valley.  To shoot something like that, you must cover a lot of area quickly and deal with motion blur, both things that a pushbroom sensor does well.

The image sliding across a pushbroom sensor does so continuously, but the pixel charges get shifted in a more discrete manner to avoid smearing them all together.  As a result, a pushbroom sensor necessarily sees about 1 pixel of motion blur in the track direction.  If SkySat-1 also has 0.7 meter pixels, and just stared straight down at the ground, then to have the same motion blur it would have to have a 93 microsecond exposure.  That is not enough time to make out a signal from the readout noise.

Most satellites use some kind of Cassegrain telescope, which has two mirrors.  It's possible to cancel the motion of the ground during the exposure by tilting the secondary mirror, generally with some kind of piezoelectric actuator.  This technique is used by the Visionmap A3 aerial survey camera.  It seems to me that it's a good match to SkyBox's light problem.  If the sensor is an interline transfer CCD, then it can expose pictures while the secondary mirror stabilizes the image, and cycle the mirror back while the image is read out.  Interline transfer CCDs make this possible because they expose the whole image array at the same time and then, before readout, shift the charges into a second set of shielded capacitors that do not accumulate charge from the photodiodes.

Let's put some numbers on this thing.  They'd want an interline transfer CCD that can store a lot of electrons in each pixel, and read them out fast.  The best thing I can find right now is the KAI-16070, which has 7.4 micron pixels that store up to 44,000 electrons.  They could use a 6 meter focal length F/12 Cassegrain, which would give them 74 cm GSD, and a ground velocity of 9,350 pixels/sec.

The CCD runs at 8 frames per second, so staring straight down the satellite will advance 865 m or 1170 pixels along the ground.  This CCD has a 4888 x 3256 pixel format, so we would expect 64% overlap in the forward direction.  This is plenty to align the frames to one another, but not enough to substantially improve signal-to-noise ratio (with stacking) or dynamic range (with alternating long and short exposures).
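Here's the frame-overlap arithmetic as a sketch.  It assumes the short side of the sensor (3256 pixels) is the one aligned with the track direction, which is what the 64% figure implies; the variable names are mine.

```python
PIXEL_PITCH_M = 7.4e-6
FOCAL_LENGTH_M = 6.0
ALTITUDE_M = 600e3
GROUND_SPEED = 6922.0        # m/s, relative ground velocity from earlier
FRAME_RATE_HZ = 8.0
ROWS_ALONG_TRACK = 3256      # assumed: short side of the sensor along track

gsd = PIXEL_PITCH_M * ALTITUDE_M / FOCAL_LENGTH_M     # meters per pixel
advance_m = GROUND_SPEED / FRAME_RATE_HZ              # ground covered between frames
advance_px = advance_m / gsd
overlap = 1.0 - advance_px / ROWS_ALONG_TRACK         # fraction shared by adjacent frames
looks = ROWS_ALONG_TRACK / advance_px                 # average frames that see each ground point

print(f"GSD:             {gsd * 100:.0f} cm")
print(f"advance/frame:   {advance_m:.0f} m = {advance_px:.0f} pixels")
print(f"forward overlap: {overlap:.0%}")
print(f"looks per point: {looks:.1f}")
```

Each ground point only shows up in two or three frames, which is why there isn't much room for stacking or exposure bracketing.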

And this, by the way, is the point of this post.  Area array image sensors have seen a huge amount of work in the last 10 years, driven by the competitive and lucrative digital camera market.  16 megapixel interline CCDs with big pixels running at 8 frames per second have only been around for a couple of years at most.  If I ran this analysis with the area arrays of five years ago the numbers would come out junk.

Back to Skybox.  When they want video, they can have the CCD read out a 4 megapixel region of interest at 30 fps.  This will be easily big enough to fill an HDTV stream.
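A quick sanity check on those video numbers.  This sketch assumes that "HDTV" means 1920 x 1080, and that a region-of-interest readout is limited by roughly the same total pixel rate as the full frame; I haven't checked the second assumption against the sensor's datasheet.

```python
FULL_FRAME_PX = 4888 * 3256        # pixel format quoted above
FULL_FRAME_FPS = 8
ROI_PX = 4_000_000                 # 4 megapixel region of interest
ROI_FPS = 30
HDTV_PX = 1920 * 1080              # assumed meaning of "HDTV"

full_rate = FULL_FRAME_PX * FULL_FRAME_FPS   # pixels/second for full frames
roi_rate = ROI_PX * ROI_FPS                  # pixels/second for the video window

print(f"full-frame readout: {full_rate / 1e6:.0f} Mpixel/s")
print(f"ROI video readout:  {roi_rate / 1e6:.0f} Mpixel/s")
print(f"ROI vs HDTV frame:  {ROI_PX / HDTV_PX:.1f}x the pixels")
```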

They'd want to expose for as long as possible.  I figure a 15 millisecond exposure ought to saturate the KAI-16070 pixels looking at a white paper sheet in full sun.  During that time the secondary mirror would have to tilt through 95 microradians, or about 20 seconds of arc for those of you who think in base-60.  Even this exposure will cause shiny objects like cars to bloom a little; any longer and sidewalks and white roofs will saturate.

To get an idea of how hard it is to shoot things in the shade from orbit, consider that a perfectly white sheet exposed to the whole sky except the sun will be the same brightness as the sky.  A light grey object with 20% albedo shaded from half the sky will be just 10% of the brightness of the sky.  That means the satellite has to see a grey object through a veil 10 times brighter than the object.  If the whole blue sky is 15% as bright as the sun, our light grey object would generate around 660 electrons of signal, swimming in sqrt(7260)=85 electrons of noise.  That's a signal to noise ratio of 7.8:1, which actually sounds pretty good.  It's a little worse than what SLR makers consider minimum acceptable noise (SNR=10:1), but better than what cellphone camera makers consider minimum acceptable noise (SNR=5:1, I think).
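The electron counts above fall out of a few multiplications.  This sketch assumes the exposure is set so that a white sheet in full sun just fills the 44,000 electron well, and that photon shot noise is the only noise source (no read noise, no dark current); the variable names are mine.

```python
import math

FULL_WELL_E = 44_000      # electrons for a white sheet in full sun (assumed exposure setting)
SKY_VS_SUN = 0.15         # whole blue sky is 15% as bright as the sun
ALBEDO = 0.20             # light grey object
SKY_VISIBLE = 0.5         # object is shaded from half the sky

veil_e = FULL_WELL_E * SKY_VS_SUN            # sky brightness superposed on the image
signal_e = veil_e * ALBEDO * SKY_VISIBLE     # the shaded grey object itself
noise_e = math.sqrt(signal_e + veil_e)       # shot noise on everything collected
snr = signal_e / noise_e

print(f"veil:   {veil_e:.0f} e-")
print(f"signal: {signal_e:.0f} e-")
print(f"noise:  {noise_e:.0f} e-")
print(f"SNR:    {snr:.1f}:1")
```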

But SNR values can't be directly compared, because you must correct for sharpness.  A camera might have really horrible SNR (like 1:1), but I can make the number better by just blurring out all the high spatial frequency components.  The measure of how much scene sharpness is preserved by the camera is MTF (stands for Modulation Transfer Function).  For reference, SLRs mounted on tripods with top-notch lenses generally have MTFs around 40% at their pixel spatial frequency.

In summary, sharpening can double the high-frequency MTF by reducing SNR by a factor of two.  Fancy denoise algorithms change this tradeoff a bit, by making assumptions about what is being looked at.  Typical assumptions are that edges are continuous and colors don't have as much contrast as intensity.

The atmosphere blurs things quite a bit on the way up, so visible-band satellites typically have around 7-10% MTF, even with nearly perfect optics.  If we do simple sharpening to get an image that looks like 40% MTF (like what we're used to from an SLR), that 20% albedo object in the shade will have SNR of around 2:1.  That's not a lot of signal -- you might see something in the noise, but you'll have to try pretty hard.
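Here's the sharpening tradeoff as a sketch.  It uses the simple rule from above, that boosting MTF by some factor costs the same factor in SNR, and starts from the roughly 7.8:1 shot-noise SNR of the shaded grey object.  The function name is mine, and real sharpening filters won't follow this rule exactly.

```python
def snr_after_sharpening(snr, native_mtf, target_mtf):
    """Simple model: raising MTF by a factor k divides SNR by k."""
    gain = target_mtf / native_mtf
    return snr / gain


SHADED_OBJECT_SNR = 7.8    # from the shot-noise estimate above
TARGET_MTF = 0.40          # what we're used to from an SLR

for native_mtf in (0.07, 0.10):
    result = snr_after_sharpening(SHADED_OBJECT_SNR, native_mtf, TARGET_MTF)
    print(f"native MTF {native_mtf:.0%}: SNR after sharpening = {result:.1f}:1")
```

The 10% end of the range gives the "around 2:1" quoted above; at 7% it's closer to 1.4:1.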

The bottom line is that recent, fast CCDs have made it possible to use area-array instead of pushbroom sensors for survey satellites.  SkyBox Imaging are the first ones to try this idea.  Noise and sharpness will be about as good as simple pushbroom sensors, which is to say that dull objects in full-sky shade won't really be visible, and everything brighter than that will.

[Updated] There are a lot of tricks to make pushbroom sensors work better than what I've presented here.

  • Most importantly, the sensor can have more rows, maybe 1000 instead of 128 for 8 times the sensitivity.  For a simple TDI sensor, that's going to require bigger pixels to store the larger amount of charge that will be accumulated.  But...
  • The sensor can have multiple readouts along each pixel column, e.g. readouts at rows 32, 96, 224, 480, 736, and 992.  The initial readouts give short exposures, which can see sunlit objects without accumulating huge numbers of photons.  Dedicated short exposure rows mean we can use small pixels, which store less charge.  Small pixels enable the use of sensors with more pixels.  Multiple long exposure readouts can be added together once digitized.  Before adding these long exposures, small amounts of diagonal image drift, which would otherwise cause blur, can be compensated with a single pixel or even half-pixel shift.
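To see what those example readout rows buy you, here is a sketch that turns them into exposure times.  The big assumption, which is mine and not from any datasheet, is that each readout drains the column, so charge accumulation starts over after every tap.  The line rate is the rounded 9,900 lines/second from earlier.

```python
READOUT_ROWS = [32, 96, 224, 480, 736, 992]   # example tap positions from the list above
LINE_RATE_HZ = 9900.0

# Rows accumulated between one readout and the next (assumes each readout empties the wells).
segments = [b - a for a, b in zip([0] + READOUT_ROWS[:-1], READOUT_ROWS)]
exposures_ms = [1000.0 * rows / LINE_RATE_HZ for rows in segments]

for tap, rows, ms in zip(READOUT_ROWS, segments, exposures_ms):
    print(f"readout at row {tap:4d}: {rows:3d} rows, {ms:5.1f} ms exposure")

# The longest segments are equal, so they can be summed after digitizing.
longest = max(segments)
long_taps = segments.count(longest)
print(f"{long_taps} long exposures of {longest} rows can be added together")
```

Under that assumption the taps give a 32, 64, 128 row ladder of short exposures, followed by three equal 256 row exposures to add up.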

[Updated] I've moved the discussion of whether SkyBox was the first to use area arrays to the next post.