Nothing Special   »   [go: up one dir, main page]

Skip to main content
Log in

MILo-DB: a personal, secure and portable database machine

  • Published:
Distributed and Parallel Databases Aims and scope Submit manuscript

Abstract

Mass-storage secure portable tokens are emerging and provide a real breakthrough in the management of sensitive data. They can embed personal data and/or metadata referencing documents stored encrypted in the Cloud and can manage them under holder’s control. Mass on-board storage requires efficient embedded database techniques. These techniques are however very challenging to design due to a combination of conflicting NAND Flash constraints and scarce RAM constraint, disqualifying known state of the art solutions. To tackle this challenge, we proposes a log-only based storage organization and an appropriate indexing scheme, which (1) produce only sequential writes compatible with the Flash constraints and (2) consume a tiny amount of RAM, independent of the database size. We show the effectiveness of this approach through a comprehensive performance study.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10

Similar content being viewed by others

Notes

  1. http://www.kuppingercole.com/.

  2. http://cyber.law.harvard.edu/projectvrm/Main_Page.

  3. A recent Microsoft survey states that “58 percent of the public and 86 percent of business leaders are excited about the possibilities of cloud computing. But more than 90 percent of them are worried about security and privacy of their data as it rests in the cloudhttp://news.cnet.com/8301-1009_3-10437844-83.html.

  4. Bloc nested loop Join is often the only Join algorithm provided in embedded DBMS products (e.g., for SQLite see http://www.sqlite.org/optoverview.html).

  5. This is not the case in high-end SSDs which can use relatively large RAM (e.g., 16 MB) to handle those constraints.

  6. Tests on 20 recent SD cards have shown that random writes are in average 1300 times more costly than sequential writes (min 130×, max 5350×) [30].

  7. A first layer (the Hardware Adaptation Level) of the controller software manages Low Level Drivers (LLD), Error Correction (ECC) and Bad Block Management (BBM). The second layer is the FTL, and it can be bypassed on most platforms.

  8. Moreover, the Flash Translation Layer becomes useless (thereby saving translation costs) and the garbage collection and wear leveling mechanism can be greatly simplified.

  9. While the strategy for handling deletes and updates is rather simple, the details on query compensation is a bit tricky and cannot be included in the paper due to size constraint.

  10. For the sake of clarity, we make here the assumption that at most one join path exists between two tables (e.g., a snowflake schema). The indexing scheme can be extended trivially to the multiple paths case.

  11. For example, the false positive rate using 4 hash functions and allocating 16 bits per value is 0.24 % [10]. Hence, Bloom Filters provide a very flexible way to trade space with performance.

  12. The result tuple identifiers are produced in reversed order, from the most recently inserted to the least recent inserted, which is suitable with the definition of Tselect Index \(\mathrm{I}_{\mathrm{T}_{j}.a\to \mathrm{T}_{i}}\) given in Sect. 4.1.

  13. Since TJoin indexes behave as normal tables, they are handled in the same way by the Merge operations, and thus, are not discussed here.

  14. The work required to recover from a crash during the reorganization can be minimized by storing on Flash the current state of each operation and analyzing this log at recovery time.

  15. Actually, it is worth managing a very small buffer (e.g., 1 page) in RAM to buffer several insertions of the same transaction.

  16. This calibration is important to take into account aspects that cannot be captured by the simulator (e.g., synchronizations problems when accessing the Flash memory). It impacts negatively the performance shown here roughly by a factor of 1.4.

  17. Note that giving a value of p for simple Flash devices like SD cards is difficult since FTL code is proprietary. It is however necessarily rather large because of their reduced cache capabilities. This is confirmed by the ratio between sequential and random writes (between 130 and 5350! [30])

References

  1. Agrawal, D., Abbadi, A.E., Wang, S.: Secure data management in the cloud. In: DNIS (2011)

    Google Scholar 

  2. Agrawal, D., Ganesan, D., Sitaraman, R., Diao, Y., Singh, S.: Lazy-adaptive tree: an optimized index structure for flash devices. In: PVLDB (2009)

    Google Scholar 

  3. Allard, T., Anciaux, N., Bouganim, L., Guo, Y., Le Folgoc, L., Nguyen, B., Pucheral, P., Ray, I., Ray, I., Yin, S.: Secure personal data servers: a vision paper. In: PVLDB (2010)

    Google Scholar 

  4. Allard, T., Anciaux, N., Bouganim, L., Pucheral, P., Thion, R.: Trustworthiness of pervasive healthcare folders. In: Pervasive and Smart Technologies for Healthcare, Information Science Reference (2009)

    Google Scholar 

  5. Anciaux, N., Benzine, M., Bouganim, L., Pucheral, P., Shasha, D.: Revelation on demand. In: DAPD (2009)

    Google Scholar 

  6. Anciaux, N., Bouganim, L., Guo, Y., Pucheral, P., Vandewalle, J.J., Yin, S.: Pluggable personal data servers. In: SIGMOD (2010)

    Google Scholar 

  7. Arge, L.: The buffer tree: a technique for designing batched external data structures. Algorithmica (2003)

  8. Bernstein, P., Reid, C., Das, S.: Hyder—a transactional record manager for shared flash. In: CIDR (2011)

    Google Scholar 

  9. Bityutskiy, A.B.: JFFS3 design issues. Tech. report (2005)

  10. Bloom, B.H.: Space/time trade-offs in hash coding with allowable errors. Commun. ACM (1970)

  11. Bolchini, C., Salice, F., Schreiber, F., Tanca, L.: Logical and physical design issues for smart card databases. In: TOIS (2003)

    Google Scholar 

  12. Bursky, D.: Secure microcontrollers keep data safe. PRN engineering services (2012). http://tinyurl.com/secureMCU

  13. Chan, C.Y., Ioannidis, Y.E.: An efficient bitmap encoding scheme for selection queries. In: SIGMOD (1999)

    Google Scholar 

  14. Debnath, B., Sengupta, S., Li, J.: SkimpyStash: RAM space skimpy key-value store on flash. In: SIGMOD (2011)

    Google Scholar 

  15. Elbaz, R., Champagne, D., Lee, R.B., Torres, L., Sassatelli, G., Guillemin, P.: TEC-tree: a low-cost, parallelizable tree for efficient defense against memory replay attacks. In: CHES (2007)

    Google Scholar 

  16. Eurosmart: Smart USB token. White paper (2008)

  17. Gemmell, J., Bell, G., Lueder, R.: MyLifeBits: a personal database for everything. Commun. ACM 49(1) (2006)

  18. Giesecke devrient: portable security token. http://www.gd-sfs.com/portable-security-token

  19. Haas, L.M., Carey, M.J., Livny, M., Shukla, A.: Seeking the truth about ad hoc join costs. VLDB J. (1997)

  20. Bonnet, P., Bouganim, L., Koltsidas, I., Viglas, S.D.: System co-design and date management for flash devices. In: PVLDB (2011)

    Google Scholar 

  21. Li, Y., He, B., Yang, R.J., Luo, Q., Yi, K.: Tree indexing on solid state drives. In: PVLDB (2010)

    Google Scholar 

  22. Li, Z., Ross, K.A.: Fast joins using join indices. VLDB J. (1999)

  23. Lim, H., Fan, B., Andersen, D., Kaminsky, M.: SILT: a memory-efficient, high-performance key-value store. In: SOSP (2011)

    Google Scholar 

  24. Menezes, A.J., Van Oorschot, P.C., Vanstone, S.A., Rivest, R.L.: Handbook of Applied Cryptography. CRC Press, Boca Raton (2001)

    Google Scholar 

  25. Moglen, E.: FreedomBox. http://freedomboxfoundation.org

  26. Muth, P., O’Neil, P., Pick, A., Weikum, G.: The LHAM log-structured history data access method. VLDB J. (2000)

  27. O’Neil, P., Cheng, E., Gawlick, D., O’Neil, E.: The log-structured merge-tree (LSM-tree). Acta Inform. (1996)

  28. Pucheral, P., Bouganim, L., Valduriez, P., Bobineau, C.: PicoDBMS: scaling down database techniques for the smart card. VLDB J. (2001)

  29. Rosenblum, M., Ousterhout, J.: The design and implementation of a log-structured file system. ACM Trans. Comput. Sci. (1992)

  30. Schmid, P., Roos, A.: SDXC/SDHC memory cards, rounded up and benchmarked. http://tinyurl.com/tom-sdxc

  31. Severance, D., Lohman, G.: Differential files: their application to the maintenance of large databases. ACM Trans. Database Syst. (1976)

  32. Sundaresan, P.: General key indexes. US Patent No. 5870747 (1999)

  33. Vo, H.T., Wang, S., Agrawal, D., Chen, G., Ooi, B.C.: LogBase: scalable log-structured storage system for write-heavy environments. Technical report (2012)

  34. Weininger, A.: Efficient execution of joins in a star schema. In: SIGMOD (2002)

    Google Scholar 

  35. Wu, C., Chang, L., Kuo, T.: An efficient b-tree layer for flash-memory storage systems. In: RTCSA (2003)

    Google Scholar 

  36. Yin, S., Pucheral, P., Meng, X.: A sequential indexing scheme for flash-based embedded systems. In: EDBT (2009)

    Google Scholar 

Download references

Acknowledgements

This work has been partially funded by the French ANR KISS project under grant No. ANR-11-INSE-0005. The authors also wish to thank Philippe Bonnet for his accurate comments on early versions of this paper.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Nicolas Anciaux.

Additional information

Communicated by Elena Ferrari.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Anciaux, N., Bouganim, L., Pucheral, P. et al. MILo-DB: a personal, secure and portable database machine. Distrib Parallel Databases 32, 37–63 (2014). https://doi.org/10.1007/s10619-012-7119-x

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10619-012-7119-x

Keywords

Navigation