Recently I announced the possible move from 32-bit AVR to Atmel’s SAM3U ARM Cortex-M3 series. Soon after, I tweeted that I’m sticking with 32-bit AVR and mentioned that I would write a blog post explaining what happened. As promised, here is the post. Fair warning: long post is long.
Way back when I first started looking around for microcontrollers to do the isostick project, I had a few requirements:
- High-Speed USB
- SD Card controller
- Internal program memory
- Enough internal SRAM to get the job done
The HS USB was needed for reasonable transfer speeds, as was the SD Card controller (bit-banging SD or using the SPI interface is insufficient). Internal program memory and SRAM reduce overall part count and board size, so I can fit it all on a conveniently tiny board.
It turns out finding a combination such as this was quite difficult at the time (though this is becoming easier with time). First I came across the now-obsolete LPC2888 from NXP, a spiffy little chip with 1 megabyte of internal flash. MASSIVE! Unfortunately the price tag was equally massive, so I set it aside and kept searching.
Next I came across the SAM3U series of Cortex-M3 microcontrollers from Atmel. They seemed great, meeting all my requirements and more! Alas, they had only just been announced at the time and weren’t available other than as samples. Knowing that it might be a very long time before they were in mass production, I set them aside as well.
While perusing Atmel’s website for other suitable microcontrollers, I recalled their 32-bit AVR AP7 series, which I thought might meet my requirements. Atmel had discontinued the AP7 series, and it lacked internal flash anyway, so it was off the list. Poking around the 32-bit AVR section, I noticed the UC3A3 series with HS USB and SD Card controllers. They had internal flash, an abundance of SRAM, and I knew if they were anything like the 8-bit AVR series they would be easy to develop for, too!
The next few days were spent digging around the interwebs for alternatives; not because I didn’t like the AVR, but simply because I wanted to make an informed decision. There were dozens of possible contenders, but most were missing the crucial combination of internal flash and HS USB.
There’s an interesting divide in the embedded micro world: microcontrollers versus microprocessors. A microprocessor historically is a CPU and a simple bus interface. Today, however, it is more common in the embedded arena to find a CPU surrounded by a plethora of amazingly useful peripherals and all sorts of fun interfaces, all in the same package. So, what differentiates a microcontroller from a microprocessor today? Internal flash and SRAM seem to be the delta, based on my own experience. Devices billed as microcontrollers tend to run at lower clock speeds and focus less on complex caching schemes and multimedia/SIMD instructions and instead put the flash and SRAM right on the die and run them at or near CPU clock rate. They compromise clock speed in exchange for a simpler device and lower external part count; a true System-on-Chip in many cases.
Getting back to the tale of part selection, I had settled on the 32-bit AVR. The feature set is impressive for such a small device. The cost is higher than that of competing ARM devices, but those generally lacked HS USB and so they weren’t an option for me. I set out learning the SD protocol, SCSI SPC, SBC and MMC, the FAT filesystem, et al. Over the next few months much code was written!
The AVR32 Studio development environment was pretty amazing. I’m not a huge fan of Eclipse (which AVR32 Studio is built upon), but it certainly has features that made the job easier. Atmel’s integration of their libraries and programmers was impressive to say the least, and the debugger was fantastic.
Fast-forward to a few months ago: AVR Studio 5 goes into beta, and wow what an amazing application that is. It is built upon Visual Studio, so *nix folk will need to try wine or run a Virtual Machine. That said, it’s entirely worth the effort to get this running if you’re not a Windows user — AVR Studio 5 is simply awesome. It’s hard to believe it’s free given how great it is.
Fast-forward a little further: the SAM3U series goes into production. It was one of my first choices, and I wanted to go back and try it for a number of reasons. Primarily speed: I want to deliver a good experience to the end user, the faster the better. A week later a SAM3U-EK Eval Kit arrived at my doorstep. Porting of the code began. Admittedly I was disappointed with the complexity of the USB Mass Storage stack Atmel provided. The SD stack was equally spaghettified, with lots of asynchronous code and layer upon layer of oniony state machines. If you were doing time-critical/realtime system design, this might be nice — it tries as best it can to avoid monopolizing the CPU for any amount of time, and I’m sure it does that well.
The tools: I learned a very hard lesson when trying the SAM3U. I’m spoiled. AVR Studio 5 and even AVR32 Studio have seriously skewed my perception of other embedded IDEs. Atollic TrueSTUDIO seemed pretty good, almost on par with the old AVR32 Studio, but I couldn’t afford to purchase a license. ARM/Keil have their own offerings but again my research showed insane pricing. Eventually I settled on using YAGARTO / GNU toolchain with vanilla Eclipse. After much frustration and heads colliding with desks, it was working.
The port: In lieu of the Atmel-supplied SAM3U USB Mass Storage and SD libraries, I ported over the AVR Software Framework (ASF) libs. The SAM3U USB libs stayed in place and I spliced the ASF’s Mass Storage drivers on top of them. I did a complete port of the ASF’s SD/MMC libs including the low-level MCI portion, because the MCI silicon used in the SAM3U and UC3A3 is largely the same (the VERSION register shows the SAM3U is a variant of an older revision than that used in UC3A3, if I recall correctly).
By porting over the ASF’s Mass Storage drivers, my SCSI stack fell right into place. Likewise, by porting the ASF’s SD/MMC services I was able to drop my filesystem code in easily. After patching up a few spots where I made assumptions about endianness (32-bit AVR is Big Endian, whereas SAM3U is Little Endian), I had things up and running.
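For anyone hitting the same kind of endianness trouble, here is a minimal sketch of one common way to avoid it: assemble multi-byte fields from individual bytes instead of casting a buffer pointer to a wider integer type. This is not the isostick code; the helper names `load_be32` and `load_le16` are made up for illustration. The examples in the comments rely on SCSI CDB fields being big-endian and FAT on-disk fields being little-endian.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a 32-bit big-endian value (for example the
 * LBA in a SCSI READ(10) command block) one byte at a time, so the
 * result does not depend on the byte order of the CPU running the code. */
static inline uint32_t load_be32(const uint8_t *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical helper: read a 16-bit little-endian value (for example
 * the bytes-per-sector field of a FAT boot sector). */
static inline uint16_t load_le16(const uint8_t *p)
{
    assert(p != NULL);
    return (uint16_t)((uint16_t)p[0] | ((uint16_t)p[1] << 8));
}
```

Code written this way gives the same answer on the big-endian UC3A3 and the little-endian SAM3U, at the cost of a few extra shifts per field.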
The problem: Testing, testing, testing. I was seeing relatively slow read speeds. The SAM3U’s MCI can run at up to half the CPU clock. The CPU clock is 96MHz, so the MCI is 48MHz. The SD bus transfers 4bits/cycle, so 4bits/cycle * 48Mcycle/second * 1byte/8bits = 24MB/s. There’s a bit of overhead to account for, but I should realistically have seen speeds in the ballpark of 20MB/s. Instead I was seeing closer to 15MB/s. For reference, the 32-bit AVR UC3A3’s MCI, running at the top speed of 33MHz, has a theoretical transfer rate of 33Mcycle/second * 4bits/cycle * 1byte/8bits = 16.5MB/s, and I see sustained throughput in the area of 12.5MB/s. The discrepancy here is likely due to the latency of issuing the command over USB (which is half-duplex), parsing that, issuing a command to the SD card, and so on.
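The arithmetic above can be captured in a couple of small helpers. This is only a sketch: the function names are invented for illustration, "MB" is taken to mean 10^6 bytes, and the calculation ignores all protocol overhead (command phases, CRC, start and stop bits), which is why it gives a ceiling and not a prediction.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: peak SD data-bus rate in bytes per second,
 * computed as clock (cycles/s) * bus width (bits/cycle) / 8 (bits/byte). */
static inline uint32_t sd_peak_bytes_per_sec(uint32_t clock_hz,
                                             uint32_t bus_width_bits)
{
    /* SD cards use a 1-bit or 4-bit data bus. */
    assert(bus_width_bits == 1 || bus_width_bits == 4);
    return (uint32_t)(((uint64_t)clock_hz * bus_width_bits) / 8u);
}

/* Hypothetical helper: measured throughput as a percentage of the
 * peak rate, rounded down to a whole percent. */
static inline uint32_t bus_efficiency_percent(uint32_t measured_bytes_per_sec,
                                              uint32_t peak_bytes_per_sec)
{
    assert(peak_bytes_per_sec > 0);
    return (uint32_t)(((uint64_t)measured_bytes_per_sec * 100u) /
                      peak_bytes_per_sec);
}
```

Plugging in the figures from this post, `sd_peak_bytes_per_sec(48000000, 4)` gives 24,000,000 for the SAM3U and `sd_peak_bytes_per_sec(33000000, 4)` gives 16,500,000 for the UC3A3. Against the measured 15MB/s and 12.5MB/s, that works out to roughly 62% and 75% of the respective ceilings.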
It dawned on me that I had only been testing with a single card, so I pulled out some others I had, and behold! One card did in fact achieve around 18MB/s, while the others hovered between 13MB/s and 15MB/s. When I tested with other commercial readers, it did indeed appear to be the cards that were at fault. Some Googling revealed this seems to be commonplace. It’s also interesting to note that the Class 2 and Class 4 microSD cards I have actually perform better for reading and writing than my Class 6 card, though the Class 6 does meet its specification. Go figure!
The decision: There were other subtle bugs in the SAM3U port (it was, after all, quickly hacked together over a week or two). So I decided to go with the 32-bit AVR UC3A3, based on four things: the real speed gain of the SAM3U was much less than predicted, fixing the subtle bugs in the port would now bring only marginal benefit, I’m deeply familiar with the 32-bit AVR UC3 series, and finally AVR Studio 5 is amazingly easy to use and well integrated.
Phew, this has been a long post. I hope this has been an interesting or at least informative insight into the decision-making that went into designing the isostick.