Lecture 9: Input/Output Formats -- Saving and Loading Data
==========================================================

In this lecture we will work in SageMath with persistent data,
stored on file. We cover three ways to save data permanently on file.
The most basic way uses plain Python files.
While the conversion of a SageMath number to a string is straightforward,
we must be careful to run the preparse command
before the application of eval.
SageMath objects can be saved and loaded
with the save and load commands, respectively.
The lecture ends with an illustration of pickle.
Data concerns not only numbers, but also includes source code definitions
of functions which can be imported into a SageMath session.

Files in Python
---------------

At the most basic level, we can use :index:`files`
to store data permanently.
As an example, we take a 20-digit approximation for :math:`\pi`.

::

   x = numerical_approx(pi,digits=20)
   x

To save the number that ``x`` refers to,
we convert it to a string and then write the string to a file.
The name of the file is ``sagexnum.txt``.

::

   sx = str(x)
   file = open('sagexnum.txt','w')
   file.write(sx)
   file.close()

The file is written in the current directory.
To change the current directory, use the ``chdir`` function
of the ``os`` module, after importing it via ``from os import chdir``.
To see the listing of the files in the current directory,
use ``listdir`` of the ``os`` module.

::

   import os
   os.listdir('.')

We clear everything to make sure ``x`` is gone.

::

   reset()
   x

Typing ``x`` then just shows the symbol ``x``:
the reference of ``x`` to the 20-digit approximation
of :math:`\pi` is gone.
To retrieve the data, we open the file for reading.

::

   file = open('sagexnum.txt','r')
   s = file.readline()
   file.close()
   print(s, 'has type', type(s))

So far, we have just executed pure Python,
and retrieved a string from file.
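
The write and read steps above can also be sketched in plain Python
with the ``with`` statement, which closes the file automatically.
This is only a minimal sketch under stated assumptions: the string
``digits`` stands in for ``str(x)``, and the names ``digits``, ``path``,
and ``retrieved`` as well as the use of a temporary directory
are illustrative choices, not part of the lecture.

```python
import os
import tempfile

# Stand-in for str(x): in a SageMath session this string would be
# produced from the 20-digit approximation itself.
digits = '3.1415926535897932385'

# Hypothetical location: a fresh temporary directory is used here
# instead of the current directory.
path = os.path.join(tempfile.mkdtemp(), 'sagexnum.txt')

# The with statement closes the file when the block ends.
with open(path, 'w') as outfile:
    outfile.write(digits)

with open(path, 'r') as infile:
    retrieved = infile.readline()

print(retrieved, 'has type', type(retrieved))
print('same string as written:', retrieved == digits)
```

What comes back from the file is a string, exactly as written;
turning it into a number again is a separate step.
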
The application of ``eval`` directly on ``s`` will give
a ``float`` on return, which is not what we want,
given that we stored a 20-digit number on file.
Consider the following cell in SageMath.

::

   x = preparse(s)
   print(x, 'has type', type(x))
   y = eval(x)
   print(y, 'has type', type(y))

Commands in a SageMath cell are, for the most part,
interpreted as Python.
Each line of code runs automatically through a :index:`preparse` step
before execution by the Python interpreter.
To see how SageMath differs from Python, we use the ``preparse`` command.
While ``s`` contains the string representation of the 20-digit
floating-point number, that is: ``3.1415926535897932385``,
the content of the string returned by ``preparse`` is different.
In particular, ``x`` contains ``RealNumber('3.1415926535897932385')``
and ``eval(x)`` returns an object of type
``sage.rings.real_mpfr.RealLiteral``
with value ``3.141592653589793239``.

Saving and Loading SageMath Objects
-----------------------------------

The ``save()`` method and the ``load()`` function are much more
convenient than working directly with :index:`Python files`.
In this section we continue with the ``y`` from the previous section.
If the ``y`` is lost, just do ``y = numerical_approx(pi, digits=20)``.
To :index:`save` a SageMath object,
we apply the ``save`` method to the object.
The argument we give to the ``save`` method is a file name.

::

   y.save('sageynum')

The result of executing the ``save`` is that the current directory
contains the file ``sageynum.sobj``.
We can check this by asking for a directory listing.

::

   import os
   os.listdir('.')

We could execute ``reset('y')`` to remove the reference to ``y``,
but we may also declare ``y`` as a variable.

::

   y = var('y')
   y

Declaring ``y`` as a variable removes the reference
to the object it referred to.
To retrieve a SageMath object from file, we use the ``load`` function.

::

   z = load('sageynum.sobj')
   print(z, 'has type', type(z))

On return, ``z`` holds ``3.141592653589793239``,
an object of type ``sage.rings.real_mpfr.RealNumber``.
While ``save()`` and ``load()`` are perhaps the most convenient ways
to work with persistent storage, note that the ``save()`` method
is defined only on SageMath objects.
A Python list, for example, has no ``save()`` method.

Pickling Objects
----------------

Python has a :index:`pickling` mechanism
which is also called :index:`serialization`.
In this section we continue with the ``z`` from the previous section.
If the ``z`` is lost, just do ``z = numerical_approx(pi, digits=20)``.
Because we will write to a text file,
we convert the pickled bytes to a string right away.

::

   import pickle
   s = str(pickle.dumps(z))
   s

The result is the string representation of a ``bytes`` object.

::

   "b'\\x80\\x03csage.rings.real_mpfr\\n__create__RealNumber_version0\\nq\\x00csage.rings.real_mpfr\\n__create__RealField_version0\\nq\\x01KC\\x89X\\x04\\x00\\x00\\x00RNDNq\\x02\\x87q\\x03Rq\\x04X\\x12\\x00\\x00\\x003.4gvml245kc4d80@0q\\x05K \\x87q\\x06Rq\\x07.'"

Now we will write the string ``s`` to file, and then later
we will delete the string and :index:`reset` ``z``.

::

   file = open('sageznum.txt', 'w')
   file.write(s)
   file.close()

We delete the string ``s`` and reset ``z``.

::

   del s
   reset('z')
   print(z, s)

Printing ``z`` results in a ``NameError``.
Now we open the file, read the string from file and then load.
We read all lines from file with the method ``readlines``
of a file object.

::

   file = open('sageznum.txt', 'r')
   lines = file.readlines()
   file.close()
   lines

We join all the elements in ``lines`` into one string ``s``.

::

   s = ''.join(lines); print(s)

The output of ``print(s)`` shows the pickled representation
of the real number.
The type of ``s`` is a string and ``s`` itself is
the string representation of a ``bytes`` object.
Before we can reconstruct the number ``z`` from the pickled object,
we evaluate the string to obtain the ``bytes`` object.

::

   bytelines = eval(s)
   print(bytelines, 'has type', type(bytelines))

Now we can reconstruct ``z`` from the pickled object.

::

   z = pickle.loads(bytelines)
   print(z, 'has type', type(z))

And then we see that ``z`` once again is an object of type
``sage.rings.real_mpfr.RealNumber`` with value ``3.141592653589793239``.

Assignments
-----------

1. Take a floating-point approximation of :math:`\sqrt{2}`
   with 30 decimal places.
   Assign this approximation to a variable,
   convert the value to a string,
   use Python to write the string object to a file, and close the file.
   Reset the variable that referred to the approximation.
   Open the same file again with Python, read the string from file,
   and convert the string into the same SageMath object that was stored.
   Verify that the value and type of the retrieved object are the same
   as those of the original object that was written to file.

2. Take a floating-point approximation of :math:`\sqrt{2}`
   with 30 decimal places.
   Assign this approximation to a variable and use the save command
   of SageMath to store this approximation to a file.
   Reset the variable that referred to the approximation.
   Use the load command of SageMath to retrieve the approximation
   from file, and verify that the value and type of the retrieved object
   are the same as those of the original object that was saved to file.

3. Take a floating-point approximation of :math:`\sqrt{2}`
   with 30 decimal places.
   Assign this approximation to a variable
   and write the pickled object to file.
   Reset the variable that referred to the approximation.
   Read the pickled object from file and reconstruct the SageMath object.
   Verify that the value and type of the retrieved object are the same
   as those of the original object that was stored to file.

4. We have covered three ways to store a SageMath object to a file.
   For each of the three ways, list one advantage and one disadvantage.
   Why and in which circumstances would you prefer one way
   over the other?

5. Do ``x = '3/4'``. Explain the difference between
   ``eval(x)`` and ``eval(preparse(x))``.
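
When weighing the three approaches in the fourth assignment,
it may help to know that pickled bytes can also be written to a file
opened in binary mode, which avoids the detour through ``str``
and ``eval``.
The following is only a sketch in plain Python, under stated assumptions:
a ``Fraction`` stands in for a SageMath number, and the names ``value``,
``path``, and ``restored`` and the file name ``value.pkl`` are hypothetical.
The same pattern should carry over to SageMath objects that support
pickling, although that is not checked here.

```python
import os
import pickle
import tempfile
from fractions import Fraction

# Stand-in for a SageMath number (an assumption for illustration).
value = Fraction(3, 4)

# Hypothetical file name in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'value.pkl')

# 'wb' opens the file for writing bytes; pickle.dump writes directly.
with open(path, 'wb') as outfile:
    pickle.dump(value, outfile)

# 'rb' opens the file for reading bytes; pickle.load rebuilds the object.
with open(path, 'rb') as infile:
    restored = pickle.load(infile)

print(restored, 'has type', type(restored))
```

Note that unpickling, like ``eval``, can execute arbitrary code,
so both should be applied only to files from a trusted source.
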