Lecture 9: Input/Output Formats – Saving and Loading Data

In this lecture we work in SageMath with persistent data, stored on file. We cover three ways to save data permanently on file. The most basic way uses plain Python files. While the conversion of a SageMath number to a string is straightforward, converting the string back requires care: we must run the preparse command before applying eval. SageMath objects can be saved and loaded with the save and load commands, respectively. The lecture ends with an illustration of pickle.

Data comprises not only numbers but also source code definitions of functions, which can be imported into a SageMath session.

Files in Python

At the most basic level, we can use files to store data permanently. As an example, we take a 20-digit approximation for \(\pi\).

x = numerical_approx(pi,digits=20)
x

To save the number that x refers to, we convert it to a string and then write the string to a file. The name of the file is sagexnum.txt.

sx = str(x)
file = open('sagexnum.txt','w')
file.write(sx)
file.close()
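
As a side note, plain Python offers the with statement, which closes the file automatically, even if an error occurs while writing. The following is a minimal sketch in pure Python, not part of the SageMath session of this lecture. The file name demo_xnum.txt and the use of a temporary directory are assumptions made for illustration, and the string sx is typed in by hand as a stand-in for str(x).

```python
import os
import tempfile

# Hypothetical stand-in for str(x): the 20-digit string from the lecture.
sx = '3.1415926535897932385'

# Work in a throwaway directory so that no files are left behind.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'demo_xnum.txt')  # illustrative file name
    # The with statement closes the file when the indented block ends.
    with open(path, 'w') as outfile:
        outfile.write(sx)
    # Read the string back from the same file.
    with open(path, 'r') as infile:
        s = infile.readline()

print(s, 'has type', type(s))
```

With this pattern there is no need to call close() explicitly.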

The file is written in the current directory. To change the current directory, use the chdir function of the os module, for example after importing it via from os import chdir. To list the files in the current directory, use the listdir function of the os module.

import os
os.listdir('.')
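
The directory functions mentioned above can be tried out safely in a temporary directory. The sketch below is pure Python and makes some assumptions for illustration: the file name demo.txt is hypothetical, and the temporary directory is created with the tempfile module so that the session's own working directory stays untouched.

```python
import os
import tempfile

start = os.getcwd()                      # remember the current directory

with tempfile.TemporaryDirectory() as tmpdir:
    os.chdir(tmpdir)                     # change the current directory
    with open('demo.txt', 'w') as f:     # hypothetical file name
        f.write('hello')
    listing = os.listdir('.')            # files in the current directory
    os.chdir(start)                      # go back before the cleanup

print(listing)
```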

We clear everything to make sure x is gone.

reset()
x

Typing x then just shows the symbol x: the reference of x to the 20-digit approximation of \(\pi\) is gone.

To retrieve the data, we will open the file for reading.

file = open('sagexnum.txt','r')
s = file.readline()
file.close()
print(s, 'has type', type(s))

So far, we have just executed pure Python, and retrieved a string from file. The application of eval directly on s will give a float on return, which is not what we want, given that we stored a 20-digit number on file. Consider the following cell in SageMath.

x = preparse(s)
print(x, 'has type', type(x))
y = eval(x)
print(y, 'has type', type(y))

Commands in a SageMath cell are interpreted by a language which is Python almost all of the time. Each line of code runs automatically through a preparser before execution by the Python interpreter. To see how SageMath differs from Python, we use the preparse command. While s contains the string representation of the 20-digit floating-point number, that is: 3.1415926535897932385, the string returned by preparse is different. In particular, x contains RealNumber('3.1415926535897932385') and eval(x) returns an object of type sage.rings.real_mpfr.RealLiteral with value 3.141592653589793239.
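
The loss of precision that preparse prevents can be observed in plain Python, outside SageMath. The sketch below is an analogy, not SageMath's own mechanism: it assumes decimal.Decimal from the standard library as a stand-in for a type that keeps all 20 digits, in the spirit of wrapping the literal in RealNumber('...').

```python
from decimal import Decimal

s = '3.1415926535897932385'    # the string as read from file

# Plain Python parses the literal as a float, which cannot hold 20 digits.
f = eval(s)
print(f, 'has type', type(f))

# Wrapping the literal in a constructor call before eval keeps every digit.
# Decimal is only a stand-in here; SageMath wraps with RealNumber instead.
wrapped = "Decimal('" + s + "')"
d = eval(wrapped)
print(d, 'has type', type(d))
```

The first print shows fewer digits than were stored, while the second shows the full string.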

Saving and Loading SageMath Objects

The save() method and the load() function are much more convenient than working directly with Python files. In this section we continue with the y from the previous section. If the y is lost, just do y = numerical_approx(pi, digits=20). To save a SageMath object, we apply the save method to the object. The argument we give to the save method is a file name.

y.save('sageynum')

The result of executing the save is that the current directory contains the file sageynum.sobj. We can check this by asking for a directory listing.

import os
os.listdir('.')

We could execute reset('y') to remove the reference to y but we may also declare y as a variable.

y = var('y')
y

Declaring y as a variable removes the reference to the object it referred to.

To retrieve a SageMath object from file, we use the load function.

z = load('sageynum.sobj')
print(z, 'has type', type(z))

On return, z is 3.141592653589793239, an object of type sage.rings.real_mpfr.RealNumber.

While save() and load() are perhaps the most convenient ways to work with persistent storage, note that the save() method is defined only for SageMath objects. A Python list, for example, has no save() method.

Pickling Objects

Python has a pickling mechanism which is also called serialization. In this section we continue with the z from the previous section. If the z is lost, just do z = numerical_approx(pi, digits=20).

import pickle
s = pickle.dumps(z)
s

The pickled object is a sequence of bytes, stored in a bytes object.

b'\x80\x03csage.rings.real_mpfr\n__create__RealNumber_version0\nq\x00csage.rings.real_mpfr\n__create__RealField_version0\nq\x01KC\x89X\x04\x00\x00\x00RNDNq\x02\x87q\x03Rq\x04X\x12\x00\x00\x003.4gvml245kc4d80@0q\x05K \x87q\x06Rq\x07.'

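Now we write the string representation of s to file, and later we delete s and reset z. A file opened in text mode accepts only strings, so we convert the bytes object with str before writing.

file = open('sageznum.txt', 'w')
file.write(str(s))
file.close()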

We delete s and reset z.

del(s)
reset('z')
print(z, s)

Printing z results in a NameError.

Now we open the file, read the string from file and then load. We read all lines from file with the method readlines of a file object.

file = open('sageznum.txt', 'r')
lines = file.readlines()
file.close()
lines

We join all the elements in lines into one string s.

s = ''.join(lines); print(s)

The output of print(s) shows the pickled representation of the real number. The type of s is str: s is the string representation of a bytes object, not the bytes object itself. Before we can reconstruct the number z from the pickled object, we evaluate the string to recover the bytes object.

bytelines = eval(s)
print(bytelines, 'has type', type(bytelines))
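
Because eval executes any code contained in its argument, a more restrictive alternative may be preferable when the file content is not fully trusted. The pure Python sketch below assumes that the string holds nothing but the representation of a bytes object, and uses ast.literal_eval from the standard library, which accepts only Python literals. The list [1, 2, 3] is a stand-in for the SageMath number. Note that pickle.loads itself should also be applied only to data from a trusted source.

```python
import ast
import pickle

data = pickle.dumps([1, 2, 3])      # a bytes object (stand-in for the number)
text = str(data)                    # its string representation, as on file

# literal_eval parses literals only; it does not run arbitrary code.
restored = ast.literal_eval(text)
print(type(restored))
print(pickle.loads(restored))
```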

Now we can reconstruct z from the pickled object.

z = pickle.loads(bytelines)
print(z, 'has type', type(z))

And then we see that z once again is an object of type sage.rings.real_mpfr.RealNumber with value 3.141592653589793239.
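
The detour via a string representation can be avoided by opening the file in binary mode. The following pure Python sketch illustrates this alternative with pickle.dump and pickle.load. It is not the method followed in this lecture: the Fraction object is a stand-in for the SageMath number, and the file name demo_znum.pkl is hypothetical.

```python
import os
import pickle
import tempfile
from fractions import Fraction

z = Fraction(355, 113)    # stand-in object; any picklable object works

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'demo_znum.pkl')  # illustrative file name
    with open(path, 'wb') as outfile:    # 'wb': write bytes, not text
        pickle.dump(z, outfile)
    with open(path, 'rb') as infile:     # 'rb': read bytes
        w = pickle.load(infile)

print(w, 'has type', type(w))
```

In binary mode the bytes are written as they are, so neither str nor eval is needed.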

Assignments

  1. Take a floating-point approximation of \(\sqrt{2}\) with 30 decimal places. Assign this approximation to a variable, convert the value to a string, use Python to write the string object to a file, and close the file. Reset the variable that referred to the approximation. Open the same file again with Python, read the string from file, and convert the string into the same SageMath object that was stored. Verify that the value and type of the retrieved object are the same as those of the original object that was written to file.

  2. Take a floating-point approximation of \(\sqrt{2}\) with 30 decimal places. Assign this approximation to a variable and use the save command of SageMath to store this approximation to a file. Reset the variable that referred to the approximation. Use the load command of SageMath to retrieve the approximation from file. Verify that the value and type of the retrieved object are the same as those of the original object that was saved to file.

  3. Take a floating-point approximation of \(\sqrt{2}\) with 30 decimal places. Assign this approximation to a variable and write the pickled object to file. Reset the variable that referred to the approximation. Read the pickled object from file and reconstruct the SageMath object. Verify that the value and type of the retrieved object are the same as those of the original object that was stored to file.

  4. We have covered three ways to store a SageMath object to a file. For each of the three ways, list one advantage and one disadvantage. Why and in which circumstances would you prefer one way over the other?

  5. Do x = '3/4'. Explain the difference between eval(x) and eval(preparse(x)).