glue API
The glue call records a Scrap (data or display value) in the given notebook cell. The scrap (recorded value) can be retrieved during later inspection of the output notebook.
import scrapbook as sb
sb.glue("hello", "world")
sb.glue("number", 123)
sb.glue("some_list", [1, 3, 5])
sb.glue("some_dict", {"a": 1, "b": 2})
sb.glue("non_json", df, 'pandas')
The scrapbook library can be used later to recover scraps (recorded values) from the output notebook:
nb = sb.read_notebook('notebook.ipynb')
nb.scraps
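Each entry in nb.scraps is keyed by the name passed to glue. As a minimal sketch, assuming each scrap exposes name, data, encoder, and display attributes and that display-only scraps carry no data, the recovered values can be collected into a plain dict. The FakeScrap tuple and the scraps_to_dict helper are hypothetical stand-ins so the snippet runs without a notebook on disk; with a real notebook you would pass nb.scraps instead.

```python
from collections import namedtuple

# Stand-in for scrapbook's scrap objects (an assumption for illustration only).
FakeScrap = namedtuple("FakeScrap", ["name", "data", "encoder", "display"])


def scraps_to_dict(scraps):
    """Map each scrap name to its recorded data, skipping scraps without data."""
    return {
        name: scrap.data
        for name, scrap in scraps.items()
        if scrap.data is not None
    }


# Hand-built collection mimicking what nb.scraps might hold.
scraps = {
    "hello": FakeScrap("hello", "world", "text", None),
    "number": FakeScrap("number", 123, "json", None),
    "sharable_png": FakeScrap("sharable_png", None, "display", {"data": {}}),
}
values = scraps_to_dict(scraps)
print(values)
```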
scrapbook will infer the storage format from the value type, using whichever registered data encoder matches. Alternatively, the inferred encoding format can be overridden by setting the encoder argument to the registered name (e.g. "json") of a particular encoder.
This data is persisted by generating a display output with a special media type identifying the content encoding format and data. These outputs are not always visible in notebook rendering but still exist in the document. Scrapbook can then rehydrate the data associated with the notebook in the future by reading these cell outputs.
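Because the scraps live in ordinary cell outputs, they can also be located by scanning the notebook JSON directly. The sketch below is illustrative only: it assumes the special media types share the prefix application/scrapbook.scrap. and that the payload is a small dict, both of which should be checked against a notebook you have actually produced. The find_scrap_outputs helper and the sample notebook fragment are hypothetical.

```python
# Assumed prefix of the media types scrapbook writes; verify against your own notebooks.
SCRAP_PREFIX = "application/scrapbook.scrap."


def find_scrap_outputs(notebook):
    """Yield (cell_index, media_type, payload) for every scrap-like output."""
    for index, cell in enumerate(notebook.get("cells", [])):
        for output in cell.get("outputs", []):
            for media_type, payload in output.get("data", {}).items():
                if media_type.startswith(SCRAP_PREFIX):
                    yield index, media_type, payload


# Hand-written fragment standing in for json.load(open("notebook.ipynb")).
notebook = {
    "cells": [
        {
            "cell_type": "code",
            "outputs": [
                {
                    "output_type": "display_data",
                    "data": {
                        "application/scrapbook.scrap.json+json": {
                            "name": "number",
                            "data": 123,
                            "encoder": "json",
                        }
                    },
                },
                # An ordinary output that should be ignored.
                {"output_type": "display_data", "data": {"text/plain": "123"}},
            ],
        },
        {"cell_type": "markdown", "source": "no outputs here"},
    ]
}

found = list(find_scrap_outputs(notebook))
print(found)
```

In practice sb.read_notebook does this work for you; the scan is mainly useful for debugging why a scrap is or is not present in a document.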
Pandas
When gluing pandas dataframes, the library uses pyarrow to translate the dataframe into a base64-encoded parquet file. Because of this tool chain, certain nested objects will not encode cleanly and will raise an Arrow exception. Common nested objects that fail include columns containing dicts or sets, either directly or nested inside other objects. Over time these nested types should become better supported (nested lists already work, for example) as Arrow adds struct transformations.
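One possible workaround, sketched here rather than taken from the library, is to serialize dict and set values to JSON strings before building the dataframe, so that every column holds plain strings or scalars. The stringify_nested helper is hypothetical, handles only top-level values in each row, and assumes set members are mutually sortable.

```python
import json


def stringify_nested(records):
    """Return copies of the row dicts with dict and set values JSON-encoded."""
    cleaned = []
    for row in records:
        new_row = {}
        for key, value in row.items():
            if isinstance(value, dict):
                # Sort keys so equal dicts always produce the same string.
                new_row[key] = json.dumps(value, sort_keys=True)
            elif isinstance(value, (set, frozenset)):
                # Sets have no JSON form; store them as sorted lists.
                new_row[key] = json.dumps(sorted(value))
            else:
                new_row[key] = value
        cleaned.append(new_row)
    return cleaned


records = [
    {"id": 1, "tags": {"b", "a"}, "meta": {"k": 1}},
    {"id": 2, "tags": set(), "meta": {}},
]
flat = stringify_nested(records)
print(flat)
# With pandas available: sb.glue("table", pd.DataFrame(flat), "pandas")
```

The reader of the output notebook then needs to call json.loads on those columns to recover the nested values.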
Display Outputs
To display a named scrap with visible display outputs, you need to indicate that the scrap is directly renderable. This can be done by setting the display argument.
# record a UI message along with the input string
sb.glue("hello", "Hello World", display=True)
The call will save the data and the display attributes of the Scrap object, making it visible as well as encoding the original data. This leans on the IPython.core.formatters.format_display_data function to translate the data object into a display and metadata dict for the notebook kernel to parse.
Another pattern that can be used is to specify that only the display data should be saved, and not the original object. This is achieved by setting the encoder to be display.
import IPython.display

# record an image without the original input object
sb.glue("sharable_png",
        IPython.display.Image(filename="sharable.png"),
        encoder='display'
)
Finally, the media types that are generated can be controlled by passing a list, tuple, or dict object as the display argument.
sb.glue("media_as_text_only",
        media_obj,
        encoder='display',
        display=('text/plain',)  # passes ['text/plain'] to format_display_data's include argument
)
sb.glue("media_without_text",
        media_obj,
        encoder='display',
        display={'exclude': 'text/plain'}  # forwarded to format_display_data's kwargs
)
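The three accepted forms of the display argument can be summarized with a small sketch. This is not scrapbook's own code; display_kwargs is a hypothetical helper that mirrors the behavior described above: a list or tuple becomes the include argument, a dict is forwarded as keyword arguments, and anything else applies no filtering.

```python
def display_kwargs(display):
    """Translate a display argument into keyword arguments for a formatter."""
    if isinstance(display, dict):
        # Dicts are forwarded unchanged (copied so the caller's dict is untouched).
        return dict(display)
    if isinstance(display, (list, tuple)):
        # Sequences restrict output to the listed media types.
        return {"include": list(display)}
    # True or None: render every media type the object supports.
    return {}


print(display_kwargs(("text/plain",)))
print(display_kwargs({"exclude": "text/plain"}))
print(display_kwargs(True))
```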
Like data scraps, these can be retrieved at a later time by accessing the scrap’s display attribute, though usually one will just use the Notebook’s reglue method.
An example using display data
For example, the following code generates a Matplotlib plot and saves only the display data as a scrap. This allows you to import the plot into another notebook.
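import matplotlib.pyplot as plt
import scrapbook as sb

# Generate our plot (x and y are assumed to be defined earlier)
fig, ax = plt.subplots()
ax.plot(x, y)
# We pass fig itself because IPython knows how to display it.
sb.glue("sharable_plot", fig, "display")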
This glues only the display information (e.g. the base64 encoded image generated by Matplotlib). In another notebook, it can be accessed and displayed like so:
nb = sb.read_notebook(path_to_first_notebook)
# To display the image and reglue it
nb.reglue('sharable_plot')
# To access the display information directly
nb.scraps['sharable_plot'].display['data']['image/png']
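Since the image/png entry holds base64 text, it can be decoded back into an image file with the standard library. The following is a minimal sketch: save_png is a hypothetical helper, and the demo payload is only the 8-byte PNG signature standing in for a real plot.

```python
import base64
import tempfile
from pathlib import Path


def save_png(encoded, path):
    """Decode a base64 image payload and write the raw bytes to path."""
    raw = base64.b64decode(encoded)
    Path(path).write_bytes(raw)
    return len(raw)


# Stand-in payload: the PNG file signature, not a complete image.
payload = base64.b64encode(b"\x89PNG\r\n\x1a\n").decode("ascii")
target = Path(tempfile.mkdtemp()) / "sharable_plot.png"
written = save_png(payload, target)
print(written, "bytes written to", target)
```

With a real notebook, the first argument would be nb.scraps['sharable_plot'].display['data']['image/png'].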