Wrapping up the Embedded Online Conference: Q&A

For a lot of 2020, we’ve been talking about avoiding end-of-life from NAND Correctable Errors. Recently, I spoke about this very topic at the Embedded Online Conference, where I got to digitally interact with many of you, and received your questions. For those not up to speed on the entire topic, please feel free to see the whitepaper we produced here. This topic brought up some interesting questions that I think warrant a little more discussion and digging.

All about the firmware

Perhaps the most common question was, “Where is the error management actually being handled?” For an example project – an ARM single board computer running Linux, with ext3 file systems on both microSD and eMMC – the answer starts with the firmware. This is special code written to work with the NAND flash media and controller. On Linux, there are also drivers to connect that firmware to a standard block device layer, allowing the developer to use block tools like encryption.

While error management is handled by the firmware, the file system can make requests which make that management much easier on the media, adding lifetime to the design. In this case, the interface used is known as Trim or Discard – a notification from the file system that blocks are no longer being used. Developers can use flash storage with the Trim or Discard notifications turned off, and they may see higher short-term performance – but both long term performance and media lifetime will suffer.
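To make the effect of Trim or Discard concrete, here is a toy Python model of how many pages a controller would have to copy out of a block before erasing it. This is a sketch under stated assumptions, not any vendor's firmware: the function name, the 64-page block, and the 48 freed pages are all made up for illustration.

```python
def gc_copy_cost(valid_pages, deleted_pages, trim_enabled):
    """Pages the flash firmware must copy before erasing one block.

    valid_pages   -- set of page numbers the firmware believes hold live data
    deleted_pages -- set of page numbers the file system has freed
    trim_enabled  -- whether the file system told the firmware about the frees
    """
    if trim_enabled:
        # Trim/Discard lets the firmware drop freed pages from its live set.
        live = valid_pages - deleted_pages
    else:
        # Without the notification, freed pages still look live and get copied.
        live = valid_pages
    return len(live)


block = set(range(64))   # a 64-page block, fully written (assumed size)
freed = set(range(48))   # the file system later deleted 48 of those pages

print(gc_copy_cost(block, freed, trim_enabled=False))  # 64
print(gc_copy_cost(block, freed, trim_enabled=True))   # 16
```

Every page copied in this model is an extra write to the media, which is where the long-term performance and lifetime cost comes from.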

Handling errors on flash media designs

Another question I received was related to special flash media designs that contain a one-time programmable (OTP) section. This sort of read-only area can be used for system firmware or default configuration settings. Even that use case does not mean it is impossible for bit errors to occur there. If the OTP section is provided by the vendor (and their firmware), they may have a contingency to handle the situation – reprogramming in place while maintaining power. This is a question worth asking. If the OTP section is more of a design choice by the development team, I would suggest working with the vendor and a flash software team to make sure errors are properly handled. In such cases, optimized and tailored support is crucial. Our team at Tuxera offers design review services which may be helpful.

Some designs, however, use flash media that doesn’t have firmware. We refer to this as “raw flash”, and on Linux that can mean using a flash file system, such as YAFFS, JFFS2 or UBIFS. This software must include the error handling that decides whether to ignore a bit error for now, or correct errors by relocating the data. Balancing this choice is dependent on use case and desired lifetime, and it’s something I discuss in our whitepaper. Unfortunately, the Linux flash file systems relocate the data on the first bit error, which can reduce lifetime considerably. This was a good choice when the NAND controllers could only handle error correction on 4 bits of data, but modern controllers can perform bit correction on 40 or more bits per NAND block.
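The trade-off between ignoring a bit error and relocating the data can be sketched as a simple threshold check. This is a hypothetical policy written in Python for illustration, not the logic used by YAFFS, JFFS2, UBIFS, or FlashFX Tera; the `margin` parameter in particular is an assumed tuning knob.

```python
def should_relocate(bit_errors, ecc_correctable_bits, margin=0.5):
    """Return True when data should be moved to a fresh block.

    bit_errors           -- correctable bit errors reported for the last read
    ecc_correctable_bits -- how many bit errors the controller can correct
    margin               -- fraction of the ECC capability tolerated before
                            relocating (hypothetical tuning knob)
    """
    if bit_errors > ecc_correctable_bits:
        raise ValueError("uncorrectable error: data cannot be trusted")
    return bit_errors >= ecc_correctable_bits * margin


# With 40-bit correction, a single bit error is tolerated for now...
print(should_relocate(1, 40))    # False
# ...but errors reaching half the ECC capability trigger a relocation.
print(should_relocate(20, 40))   # True
```

A lower margin relocates earlier and spends more erase cycles; a higher margin preserves lifetime but leaves less headroom before an error becomes uncorrectable.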

Tuxera’s FlashFX Tera is a Linux solution which can handle these situations with ease. To learn more about it, click here.

Final thoughts

I’ve really appreciated getting to answer questions and discuss file systems with other enthusiasts in the Embedded Online Conference. Later this month, I’ll be speaking on the topic of automotive security at GENIVI AMM. It will be another great opportunity to talk to you about embedded software – this time on the topic of automotive safety. I’m looking forward to the questions and comments from all of you – perspectives that I’m sure will have me thinking about data storage in a new way.

Let us help you solve your data storage challenges.


The Embedded Online Conference – ongoing tech seminars a click away

Just last month, I spoke at the fully virtual Embedded Online Conference. Avoiding end of life from NAND correctable errors is a topic I’ve covered in the past, and it's still just as relevant when it comes to flash memory lifetime.

But just how did I end up speaking at the Embedded Online Conference in the first place?

I was on the train from Frankfurt to Nuremberg for Embedded World 2019, where I was speaking on a couple of topics, and manning the tradeshow booth. I pulled out my laptop to get a little work done, and noticed the gentleman across from me was doing the same. We got to chatting, and I found out that he, Jacob Beningo, also worked in the embedded systems industry, and was looking forward to the three-day show.

Jacob was a consultant, and pitched an interesting idea about an online conference. His idea was that attendees could go to virtual sessions, handle questions through a forum interface, and there would even be a virtual "trade show" floor with product demonstrations. I was definitely interested, and it turns out Jacob was ahead of his time.

Still live, still connected

Surveying my inbox, it looks like the remaining tradeshows this year are going virtual. Fortunately, the folks at Embedded Online Conference had a head-start. They put together a really nice site, with presentations "going live" at particular times. These sessions (and the show) will remain live through July – so you can watch the talks at your own pace, and leave questions and comments too. What's more, there is even a healthy discount on the registration page if you have been furloughed or laid off because of the Coronavirus – this is a great opportunity for training!

I've had some great questions from my talk, and I'm already thinking hard to come up with topics for next year's conference. Will I see you there?

Visit the Embedded Online Conference site


Help! Why are my embedded devices failing?

When devices fail, the problems can be numerous. In conversations with the embedded OEMs we work with, a common issue affects almost every manufacturer – the cost of diagnosing and fixing the causes of field failure. This impacts time-to-market and pulls resources away from development, to be used instead for field diagnostics and post-mortem analysis. This issue is especially relevant for the following reasons:

  1. The need for defect prevention during field operations: The high degree of reliability required for protecting critical data dictates that devices must not fail. To ensure that devices are wear-fail-safe, manufacturers are required to run extensive tests for a range of user scenarios so as to safeguard against edge cases. The analysis of test results can be a daunting task due to several interfaces between hardware, software, and application layers. Hence, there is a need to continuously track these interactions, so that during a failure, any difference in the interactions can be discovered and corrected.
  2. Vulnerability of device to wear-related failures: As flash media continues to increase in density and complexity, it’s also becoming more vulnerable to wear-related failures. With the shrinking lithography comes increased ECC requirements, and the move to more bits/cell. With this also comes a concern that what was written to the disk may not in fact be what is read off the disk. However, most applications assume that the data written to the file system will be completely accurate when read back. If the application does not fully validate the data read, there may be errors in the data that cause the application to fail, hang or just misbehave. These complications require checks that validate the data read against the data written, so as to prevent device failures due to data corruption.
  3. Complexity of hardware and software integration: The complex nature of hardware and software integration within embedded devices makes finding the cause of failures a painstaking job, one that requires coordination between several hardware and software vendors. For this reason, it often takes OEMs days to investigate causes at the file system layer alone. Problems below that layer can entail more extensive testing and involve multiple vendors. Log messages can help manufacturers pinpoint the location of failure so that the correct vendor can be notified.

This ability to pinpoint the cause of failure is especially helpful when an OEM is:

    • Troubleshooting during the manufacturing and testing process to make sure that their devices do not fail for the given user scenarios.
    • Doing post-mortem analysis on parts returned from their customers, in order to understand the reasons for failures, and possible solutions.
    • Required to maintain a log of interactions between the various parts of the device, for future assistance with failure prevention or optimization.

Identifying the causes and costs of field failure is one thing, but what solutions can OEMs turn to in order to prevent these issues in the first place?

Fighting field failure with transactional file systems

Thankfully, various file system solutions exist for safeguarding critical data. FAT remains a simple and robust option with decent performance. Unfortunately, it isn’t able to provide the degree of data protection or performance that is sometimes needed. In safety-critical industries like automotive, aerospace, and industrial, basic file systems like FAT are often unable to meet the needed performance and reliability.

Transactional file systems like Tuxera’s own Reliance Edge offer a level of reliability, control, and performance for data that is simply too vital to be lost or corrupted. One of the key features of Reliance Edge is that it never overwrites live data, ensuring a backup version of that data remains safe and sound. This helps preserve user data in the event of power loss.
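The idea of never overwriting live data can be illustrated with a small Python model. This is only a conceptual sketch and not how Reliance Edge is implemented; the class and method names are hypothetical.

```python
class TransactionalStore:
    """Toy model: writes never touch the committed copy until transact()."""

    def __init__(self):
        self._committed = {}   # state as of the last transaction point
        self._working = {}     # changes written since then

    def write(self, name, data):
        # New data goes to fresh space; the committed copy is left intact.
        self._working[name] = bytes(data)

    def read(self, name):
        if name in self._working:
            return self._working[name]
        return self._committed[name]

    def transact(self):
        # Transaction point: the working state becomes the committed state.
        self._committed = {**self._committed, **self._working}
        self._working = {}

    def power_loss(self):
        # After an interruption, only the committed state is visible.
        self._working = {}


store = TransactionalStore()
store.write("config", b"v1")
store.transact()
store.write("config", b"v2-partial")
store.power_loss()
print(store.read("config"))  # b'v1'
```

Because the half-written update never replaced the committed copy, the model comes back from the simulated power loss with the last known-good data.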

In the video below, I demonstrate the performance and data preservation differences between a FAT driver and Reliance Edge.


Final thoughts

Correctly finding and identifying the cause of field failures is the first step in tackling them. The next step is choosing the right solution – one that’s optimized to secure your critical data specifically in case of field failure and power loss.

Embedded device manufacturers – find out how Reliance Edge can help bulletproof your critical data.



Are existing SMB solutions scalable and can they cope with current demands?

With one or two parents required to work from home, online learning for the children, and even recreation, nearly everything now involves streaming media of some sort. For more secure work, VPN is often the norm. Can all aspects of the internet handle the traffic?

In the current circumstances of home-centric everything, internet service traffic has skyrocketed in both professional and recreational spheres. The streaming service Netflix, for example, saw an unprecedented gain of 15.8 million subscribers for the first quarter of 2020. In my own region of Seattle and Washington State, internet traffic is up considerably – 30% to 40% higher than in January of this year. In response, local service providers such as Comcast and T-Mobile have waived their bandwidth caps, at least in the short term. One of their concerns is this stress test of the "last-mile" services - the modems, routers and other components of home networks.

SMB protocol is more relevant than ever for shared content access

Besides the need for high throughput – or high transfer speeds – another concern is secure access to shared files, and this is where networking protocols come in. Home routers connect to the enterprise local area network (LAN), often through VPN. Many workers staying at home connect through individual paths to a few enterprise servers, and Server Message Block (SMB) is the protocol that allows the sharing of the common files they need to do their jobs.

SMB servers can be open source solutions or proprietary implementations. The most commonly used implementation is called Samba, a helpful open-source alternative. Tuxera maintains its own proprietary implementation – Fusion File Share by Tuxera – with commercial-grade SMB features and enhancements that will handle the current stresses content providers and enterprises are facing during the COVID-19 epidemic – multiple users accessing the same content over the network.

Scalability is critical when countless organizations have switched to remote work

The key measurement for the current situation is scalability, because these network protocols need to provide files to more than just a few people – we’re talking 10s, 100s, even 1000s in the case of a large global enterprise such as a banking or medical institution. Companies are worried about whether their storage solutions can handle all the load of remote work. When an entire company hits the shared file at once, will all their requests get through without serious delay or even critical failures?

Increased loads have shown Samba can easily max out CPU and memory usage at 100%. This illustrates the challenges facing SMB protocols in today’s crisis. While Samba can be tuned to handle speed issues, implementing proper security and scalability measures unfortunately demands more human and infrastructure resources, increasing costs.

Final thoughts

The increased networking demands we’ve discussed place significant stress on widely used SMB services, with results felt across multiple industries, from banks to medical institutions. These disruptions can put organizations that are integral to societal function at risk. What’s worse, these risks are exacerbated given the uncertain nature of the current pandemic. This wasn't the use case that most network solution providers envisioned, but this is where we are today. Networking protocols that are sluggish and unreliable are simply unacceptable in a world that requires rapid data access.

Thankfully, solutions do exist to help network providers easily tackle speed and scalability in SMB. Latency and client overload are something Tuxera has tested for in SMB networking events for years, and we stand proudly behind our solution.

But regardless of the solution chosen, network service providers must evaluate how they can stay prepared for the scalability and security needs of the crisis today – as well as the needs of tomorrow.



Are you formatting your SD memory cards optimally?

We are excited to share with you an article from one of our valued partners, the SD Association. The following is a snippet from the original article. Be sure to read the full article here: The SD Memory Card Formatter – How this handy tool solves your memory card formatting needs.

For many of us, SD memory cards are an easy way to keep our important files and precious memories stored safely. But after using an SD memory card for a long time, files may begin to fragment, which can result in performance deterioration of the card. That’s when we use simple reformatting methods to wipe cards clean in an effort to restore their reliability and performance. Proper formatting is therefore essential in keeping our critical document files and favorite photos or videos available for future viewing.

First-rate formatting with the SD Memory Card Formatter

When formatting an SD memory card, specific tools and methods are required in order to ensure an effective process with minimal data loss.

The SD Memory Card Formatter, developed by Tuxera, handles SD memory cards in accordance with standards defined by the SD Association. In fact, it’s the official tool for formatting any SD, SDHC, and SDXC memory cards, as recommended by the SD Association. By optimizing an SD memory card to SD Association standards, the SD Memory Card Formatter safely improves the card's performance and lifetime. Operating system (OS) built-in formatters are rarely tested as rigorously, and often do not follow these standards as closely, resulting in formatting processes that are less reliable – and potentially leading to earlier memory card failure.

The SD Memory Card Formatter is designed to be the best tool for the job, for virtually every type of user – offering you the highest level of reliability and data integrity for all of your formatting and reformatting needs.

Read the full article on the SD Association’s website for more information and technical details on how the SD Memory Card Formatter can help you.


Embedded World 2020 wrap-up: smaller players and display tech get the stage

Tuxera has attended Embedded World for many years now, it being one of the premier events for embedded technology not just in Europe, but the world. It’s always an excellent opportunity for us to speak directly to our partners and other industry players. This year was no different, and though an impact from fears of illness was expected – after all Mobile World Congress was cancelled – the event proved insightful and productive.

Breathing room for meaningful interactions – and fun

Attendance was roughly a third of the previous year, and many companies opted out of exhibiting – some at the last minute. The organizers however did a good job of covering for missing booths, moving exhibitors from hall 5 into hall 4, and setting up places to sit in many parts of hall 4A – including an area with a beach theme! Nevertheless, it was a bit weird to walk by some large, fully constructed booths with no people or equipment in them.

There were also fewer people whose sole job was to pass out flyers and invite you into their stand. That led to more substantial conversations with more knowledgeable booth staff for the attendees.

Key meetings and greetings

It also seemed that there was a higher ratio of academics-to-working-professionals. Unlike in prior years when the bulk of students visited on Thursday, eager pupils wandered into our booth on all three days. I had the opportunity to demonstrate our GPL version of Reliance Edge and hear about some of the interesting projects they were working on. Perhaps the added data integrity of our file system will lead to their success!

Another silver lining was the opportunity for exhibitors to spend time with other exhibitors – visiting booths, seeing the demonstrations, and comparing notes. We came away with a few more partnership opportunities than in previous years, when we were busy talking to designers and students. I am especially excited about opportunities with Toradex – who premiered a product to assist those migrating from Windows Embedded to Linux. We also had a chance to explore deepening our partnerships with Green Hills in automotive and Mentor Graphics in resource-constrained certified markets.

Industry trends on display

A big theme this year among exhibitors was graphical interfaces and display technologies. Many of the exhibitors were discussing graphical interfaces and ways to speed up debugging. There were also some truly impressive display technologies, including large transparent screens and flexible ones.

Without the large semiconductor companies at the event, smaller players had a chance to get their messages out. There were a lot of special purpose chip vendors around, but far fewer bleeding-edge chips shown due to the lack of big player attendance. As a result, special purpose chips and bespoke systems developers (as well as open source consortia) had audiences that would have probably overlooked them while busy talking to impressive players like ST Microelectronics and Wind River Systems in other years. At least one customer we spoke with had made the decision not to attend directly due to ST Microelectronics’ absence.

Looking onwards

The dates for next year have already been selected, and it is likely we will attend. What will be interesting to see is the impact this year’s show has on our plans for later this year, and of course next year. We’re already getting excited for Electronica this November, and the chance to meet more big players there. Will large booth companies like ST Microelectronics be back in the same space? Can the organizers do anything to improve attendance? And what new trends will happen to embedded designs in the next year? Always interested to find out!


One small step to a reliable file system

The Reliance Edge File System Essentials (FSE) is one of two API sets supported by Reliance Edge. It’s a minimalistic but reliable alternative to the POSIX-like option.

What are its benefits and how does it work? This feature summary should answer those questions.

The micro storage issue

Many small embedded designs don't have storage for data. Instead, programs on the device are simply loaded and executed. More sensors in the device and data-heavy situations mean a greater need to log some data – or decisions made – for later troubleshooting. Then again, some newer embedded designs are primarily used to gather sensor data, even if it is only until the device is in range of the cloud.

Such increases in data storage needs mean that system designers must eventually migrate from having no file system to needing something – if not a full POSIX implementation. They can take these steps on their own – treating storage as a memory pool here, storing data from multiple sensors there. While such an approach is doable, it opens the door to considerable increases in workload through complexity. For instance, storing data in two different "files" in a memory pool can mean load balancing. When doing this, a system designer also needs to consider unexpected interruptions, special media handling, and especially tests. Not an easy task.

A far better alternative is to hand the task over to experts. Until recently, though, the only file systems available for microcontrollers were simple FAT implementations, with little real thought towards fail safety or even performance. Reliance Edge changes all that – and File System Essentials provides a solid first step.

Reliance Edge FSE – A smart solution

Within FSE, there are no file names or paths. Instead, a set of numbered locations is defined. Within the code, they can be specified by #define values to make the project more readable. These locations can be read from, written to, and truncated. The size of these "files" can also be read. The number of available locations is fixed at compile time.
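The shape of such an interface can be modeled in a few lines. The sketch below is written in Python for brevity and is not the Reliance Edge FSE API (which is a C API); the class, method names, and constants are hypothetical stand-ins for the read, write, truncate, and size operations described above.

```python
FILE_SENSOR_LOG = 0   # plays the role of a #define in the C project
FILE_SETTINGS = 1
FILE_COUNT = 2        # fixed up front, like the compile-time file count


class NumberedFiles:
    def __init__(self, file_count):
        # Every file exists from the start, each with zero size.
        self._files = [bytearray() for _ in range(file_count)]

    def write(self, file_num, offset, data):
        f = self._files[file_num]
        if offset > len(f):
            f.extend(b"\x00" * (offset - len(f)))  # zero-fill any gap
        f[offset:offset + len(data)] = data
        return len(data)

    def read(self, file_num, offset, length):
        return bytes(self._files[file_num][offset:offset + length])

    def truncate(self, file_num, new_size):
        f = self._files[file_num]
        if new_size < len(f):
            del f[new_size:]
        else:
            f.extend(b"\x00" * (new_size - len(f)))

    def size(self, file_num):
        return len(self._files[file_num])


vol = NumberedFiles(FILE_COUNT)
vol.write(FILE_SENSOR_LOG, 0, b"temp=21")
print(vol.size(FILE_SENSOR_LOG))        # 7
print(vol.read(FILE_SENSOR_LOG, 5, 2))  # b'21'
print(vol.size(FILE_SETTINGS))          # 0
```

Note that there is no open or close step anywhere: a file is addressed by its number alone, which is what keeps the interface so small.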

As mentioned earlier, multiple files increase the complexity of testing for a simple memory pool – doubly so if the effects of power interruption also have to be managed. With FSE however, all the tests are provided, and all core file interactions have already been validated. Reliability is provided by transaction points, also fully tested.

These tests were designed for Automotive SPICE, and Reliance Edge is written in MISRA C. While not necessary for a small embedded device, you can take comfort in just how fully tested this software is. Integrating Reliance Edge FSE into a project may be the simplest – and most effective – next step available today. No operating system required!

A compact, focused tool

Reliance Edge FSE has been designed to fill a very specific role. Like any highly focused solution, this means it sacrifices some breadth to achieve the levels of precision needed. The most obvious curveball of Reliance Edge FSE is that it’s not a full file system. Names, folders, handles, and file attributes are all missing. A file can never be "opened" or "closed" – it just exists.

Another aspect of FSE is that it’s modest with its disk space in order to stay tiny. And while this is a limitation, it’s probably a minor one for most designs. Reliance Edge FSE takes a simple, practical approach to your disk space needs – at format time, all the chosen files are created with zero size. There are no quotas or file size limitations, but all the data still has to fit in the available space.

Reliance Edge FSE uses just 3898 bytes of RAM – though that’s twice what was available for the entire Apollo 11 Guidance Computer, it speaks of how far flash storage needs have come.

Final thoughts

The File System Essentials API set can be a great stepping-stone from no file system at all to full POSIX. In fact, it can even be the only solution needed for some edge designs. With full tests, functionality, and reliability, it’s far better to use Reliance Edge than something written ad-hoc by even the best developers. So if you’re looking for a compact, no-frills tool to handle your embedded flash storage needs – we’ve got you covered.

Embedded manufacturers – let’s work together to ensure your data storage needs are accurately met.


Embedded file systems – trickier than you think

The Electronic Engineering Journal published an interesting article by Jim Turley this week, discussing file systems and the popular SD media. While this article brings up some good points about media reliability, I’d like to dive a little deeper into two of the points he talks about – hopefully giving a bit more perspective. A file system designed for better reliability can be less tricky than you think.

Definitions of reliability

The users of embedded devices are probably not file system experts, and sometimes the designers of the devices aren't either. From the perspective of the user, they just want their data to be on the device when they expect it to be. We think of this as data integrity. As the device ages, data retention also becomes a consideration – but that’s a topic for another blog post. Some of the techniques that protect the data integrity include journaling the data, using redundant writes, atomic updates like those in the Tuxera product family, and transaction points provided by Tuxera's Reliance family of file systems.

The designer of the device may – or may not – care about the user data, but the absolute requirement from their perspective is that the device be able to boot and operate. This is the primary focus of most reliability improvements to file systems over the last decades – making the file system fail-safe. Some of the techniques used include logging or journaling the metadata, atomic operations, and utilizing the second FAT table to provide a pseudo-transaction – as in Microsoft Transaction-Safe exFAT (TexFAT). Most operations that protect the data integrity also provide a fail-safe environment for the system data.

Underlying all of this is the hardware, and as Jim Turley pointed out, reliability has to be a design concern from top to bottom, not just an add-on or an afterthought. The file system certainly can't prevent failures of the media – blocks or sectors going bad, in other words – but it should be able to detect and mitigate them.

Mitigating media problems

SD media fails in a number of ways, including failure to read or write, and returning erroneous data. The first two are easily detected by the file system, but the third can be a bit trickier.

Detecting erroneous data in the system data provides a different level of fail-safety, and this is often done with a CRC on the file system structures and metadata. The default file system on Linux can do this, but it is not enabled by default.
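As a minimal sketch of that idea, the Python below appends a CRC-32 to a small metadata block and checks it on read. The field layout (magic number, inode count, free bytes) is hypothetical and does not match any real file system's on-disk format; `zlib.crc32` and `struct` are from the Python standard library.

```python
import struct
import zlib

# Hypothetical metadata layout: magic, inode count, free bytes (little-endian).
HEADER = struct.Struct("<IIQ")


def seal(magic, inode_count, free_bytes):
    """Pack the metadata and append a CRC-32 of the packed bytes."""
    body = HEADER.pack(magic, inode_count, free_bytes)
    return body + struct.pack("<I", zlib.crc32(body))


def verify(block):
    """Return True if the stored CRC matches the metadata bytes."""
    body = block[:-4]
    stored = struct.unpack("<I", block[-4:])[0]
    return zlib.crc32(body) == stored


block = seal(0x12345678, 10, 4096)
print(verify(block))             # True

damaged = bytearray(block)
damaged[5] ^= 0x01               # flip a single bit, as bad media might
print(verify(bytes(damaged)))    # False
```

The read itself "succeeds" in the damaged case, which is exactly why a check like this is needed to catch the third failure mode.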

Once detected, the next step is handling the error – is recovery possible? For user files and folders, a disk check can mark those files – or restore data to a fixed name like FILE0000.CHK – and move on. While the user may lose data, at least the system continues to function. For system files and folders, the solution can be a lot more difficult.

Our file systems either transparently recover on-the-fly or optionally throw an exception for these situations, allowing the system designer to handle some situations gracefully. As an example, an error in the map data of an automotive design could result in an error message letting the driver know that map data is unavailable or corrupt, and that they should return to a dealer for an update.
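At the application level, handling such an exception gracefully might look like the following Python sketch. The exception class, the file path, and the `read_file` callback are all hypothetical; they stand in for whatever error reporting mechanism the file system and platform actually provide.

```python
class CorruptDataError(Exception):
    """Hypothetical error raised when a file fails its integrity check."""


def load_map(read_file):
    """Return (ok, payload_or_message) instead of letting the error crash the UI."""
    try:
        return True, read_file("maps/region.dat")
    except CorruptDataError:
        return False, ("Map data is unavailable or corrupt. "
                       "Please visit a dealer for an update.")


def failing_reader(path):
    # Simulates the file system detecting corruption in the requested file.
    raise CorruptDataError(path)


ok, result = load_map(failing_reader)
print(ok)      # False
print(result)  # the message shown to the driver
```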

The unhandled exception, utilized primarily in system validation, is also useful because it can lock the system down in a read-only state. This allows the test engineers to step in and see exactly where failure occurred, helping them quickly determine the root cause of the failure.

We can go one step further and provide optional CRC protection of user data files, taking user data integrity to a much higher level.

Final thoughts

While Turley's article does point out key design concerns, he suggests that the media is most of the problem. I've used this space to explain some of the file system choices for reliability, and how data integrity differs from fail safety. We also examined how detection of a problem can lead to possible solutions – or at least more graceful failures.

As we’ve seen, SD card reliability can be a tough nut to crack. But with the right expertise, it’s doable.

SD manufacturers – let’s work together to ensure your data is responsive, reliable, and fail-safe.

Contact us

On using U-Boot (universal bootloader) in embedded designs

"Das U-Boot, the 'universal boot-loader', is arguably the richest, most flexible, and most actively developed open source bootloader available." (Yaghmour, 2003)

Das U-Boot supports a wide range of ARM, PPC and x86 options, with rich documentation and the capability of both reading and writing the media.

It is important to note that this software is licensed under the terms of GPLv2, which requires full source code for “derived works” to be available to recipients of a compiled version. “Derived work” is generally understood to mean a single statically-linked executable, which is what U-Boot is. On Linux, U-Boot supports a number of file systems, including ext2, 3, and 4, FAT, Squashfs, ZFS, and btrfs. It also includes the flash file systems JFFS2 and UBIFS – more on that later.

U-Boot on embedded Linux

As useful as U-Boot is for embedded designs, there are a couple of challenges that need to be addressed. The supported file systems in U-Boot are also supported by the standard Linux kernel, which means kernel files can be read from those same file systems, giving some flexibility for runtime updates. Before U-Boot, kernel images were typically read from sequential blocks on the media and usually weren’t stored within a file system at all. To update an image during runtime the entire update had to be written at once, with no protection for power interruption, which left the system vulnerable to being “bricked” during an update.

Many system designers looking for a power fail-safe way to manage data turn to our file system, Reliance Nitro. But isn’t this a proprietary solution that uses a license that is not compatible with GPL? Yes, that is true for the full version of Reliance Nitro. However, Tuxera provides a Reliance Nitro reader under a GPLv2-compatible license which can be used in the U-Boot bootloader. This solution can read files from a Reliance Nitro-formatted partition, even though the file system kernel objects are not yet loaded. Using Reliance Nitro, these kernel files could then be updated at runtime with confidence that random power loss won’t cause corruption.

An additional challenge exists for designs using raw flash media storage: a driver is required to read the NAND or NOR flash. On Linux, that driver typically is the Memory Technology Device (MTD). Versions of two flash file systems (JFFS2 and UBIFS) are provided in U-Boot for this environment. Provided that these choices meet your needs, you could be all set. However, all of the features of modern flash devices may not be supported or optimized in these solutions.

Can a developer just use one of the basic flash file systems to boot and then hand off to a more robust solution for normal operation? Unfortunately not. The media manager portion of those flash file systems cannot hand off control to another media manager. If MTD is utilized to read the NAND for the bootloader, its use will also be required to control the NAND during normal operation. Fortunately, our proprietary flash management software, FlashFX Tera, can be used in this arrangement, because we provide a compatible flash information module (FIM) for MTD.

For non-Linux environments

Das U-Boot has also been widely adopted outside of Linux, in environments such as Wind River’s VxWorks and smaller kernel solutions such as FreeRTOS. Of the file systems built into U-Boot, only the unreliable FAT is supported by these operating systems – however, there are simple ways to add power failsafe reliability and cross-platform support even to these operating systems.

Tuxera’s file system solution for small environments is Reliance Edge. As this software can be licensed under GPL v2, we have ported it to the U-Boot environment, allowing a system to use U-Boot to bring up an operating system image or kernel image stored on a Reliance Edge file system volume. As in the Linux environment, this means that image can be updated during runtime with confidence that the transaction points of Reliance Edge provide protection against power loss or other interruption.

Reliance Edge will work with any media solution that U-Boot can use, including the commonly available SD and eMMC media. What is not provided in many of these environments is a raw NAND media driver – the equivalent of MTD.

Until a better solution is available, the best option is to boot from media that is readable by U-Boot in those environments – SPI NOR flash or SD media. After the system boot is complete, a NAND driver (such as FlashFX Tera) can be loaded to directly access the NAND media for storage.

Have more questions?

If you have more questions on this topic, I’ll be happy to answer them for you. Just drop us a line through our contact form and we’ll get back to you as soon as possible.

Learn more about our products mentioned in this article.

Tuxera FlashFX Tera
Tuxera Reliance Edge
Tuxera Reliance Nitro