#185 This code is snooping on you (a good thing!)

Python Bytes - A podcast by Michael Kennedy and Brian Okken - Mondays

Sponsored by Datadog: pythonbytes.fm/datadog


Brian #1: MyST - Markedly Structured Text


Michael #2: direnv

  • via __dann__
  • direnv is an extension for your shell. It augments existing shells with a new feature that can load and unload environment variables depending on the current directory.
  • Use cases
    • Load 12factor apps environment variables
    • Create per-project isolated development environments
    • Load secrets for deployment
  • Before each prompt, direnv checks for the existence of a .envrc file in the current and parent directories.
  • If the file exists, it is loaded into a bash sub-shell and all exported variables are then captured by direnv and then made available to the current shell.
  • It supports hooks for all the common shells like bash, zsh, tcsh and fish. This allows project-specific environment variables without cluttering the ~/.profile file.
  • Because direnv is compiled into a single static executable, it is fast enough to be unnoticeable on each prompt.
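
The lookup described above (check the current directory, then each parent, for a .envrc file) can be sketched in a few lines of Python. This is only an illustration of the search, not direnv's actual implementation; find_envrc and the APP_ENV variable are hypothetical names, and the sketch assumes the nearest file wins.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_envrc(start: Path) -> Optional[Path]:
    """Return the nearest .envrc in `start` or one of its parents, else None."""
    start = start.resolve()
    for directory in (start, *start.parents):
        candidate = directory / ".envrc"
        if candidate.is_file():
            return candidate
    return None


# Demo: a throwaway project with a .envrc at its root and a nested directory.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp).resolve() / "project"
    nested = project / "src" / "pkg"
    nested.mkdir(parents=True)
    (project / ".envrc").write_text("export APP_ENV=development\n")

    # Searching from the nested directory walks up to the project root.
    found = find_envrc(nested)
    print(found)
```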

Brian #3: Convert a Python Enum to JSON

  • Alexander Hultner

Problem:

  • By default, Enum members are not serializable.
  • So you can't use them as values in JSON.
  • And you can't pass them as values to databases.

Solution:

  • Derived enumerations, like IntEnum or custom derived enumerations, are simple to define and serializable.
  • You can convert them to JSON and store them as database values.

Example:

    >>> from enum import Enum, IntEnum
    >>> import json
    >>> class Color(Enum):
    ...   red = 1
    ...   blue = 2
    ...
    >>> c = Color.red
    >>> c
    <Color.red: 1>
    >>>
    >>> json.dumps(c)
    Traceback (most recent call last):
    ...
    TypeError: Object of type Color is not JSON serializable


    >>> class Color(IntEnum):
    ...   red = 1
    ...   blue = 2
    ...
    >>> c = Color.red
    >>> c
    <Color.red: 1>
    >>> json.dumps(c)
    '1'


    >>> class Color(str, Enum):
    ...   red = "red"
    ...   blue = "blue"
    ...
    >>> c = Color.red
    >>> c
    <Color.red: 'red'>
    >>> json.dumps(c)
    '"red"'
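
Going the other way is worth a quick look too. The sketch below round-trips a str-derived enum through JSON and rebuilds the member by calling the class with the decoded value; the "favorite" key is just an illustrative name.

```python
import json
from enum import Enum


class Color(str, Enum):
    red = "red"
    blue = "blue"


# Serialize: the member is written out as its plain string value.
payload = json.dumps({"favorite": Color.red})

# Deserialize: JSON gives back a plain str, so look the member up by value.
restored = Color(json.loads(payload)["favorite"])
print(payload, restored is Color.red)
```

The value coming back from JSON (or a database) is an ordinary string, so the lookup by value is what turns it back into an enum member.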

Michael #4: Pendulum: Python datetimes made easy

  • via tuckerbeck
  • Drop-in replacement for the standard datetime class.
  • Time deltas
    import pendulum

    dur = pendulum.duration(days=15)

    # More properties
    dur.weeks
    dur.hours

    # Handy methods
    dur.in_hours()  # 360
    dur.in_words(locale="en_us")  # '2 weeks 1 day'
  • Intervals
    import pendulum

    dt = pendulum.now()

    # A period is the difference between 2 instances
    period = dt - dt.subtract(days=3)

    period.in_weekdays()

    # A period is iterable (use a new name so dt is not overwritten)
    for day in period:
        print(day)

Brian #5: PySnooper - Never use print for debugging again

  • Thanks @pylang23 for the suggestion.
  • With PySnooper you can just add one decorator line to a function and you get a play-by-play log of your function, including which lines ran and when, and exactly when local variables were changed.
  • Logs
    • every modified variable with value
    • which line of code is being run
    • return value
    • passed in parameters
    • elapsed time
  • Options to:
    • isolate logging to a section of a function with a with block
    • log to a file instead of stdout
    • extend watch to a list of non-local variables
    • extend watch to functions called by the function being decorated
  • All with a simple decorator and a pretty simple API
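
To get a feel for what such a play-by-play log involves, here is a rough standard-library sketch built on sys.settrace. It is not PySnooper's implementation or API (PySnooper itself is applied with @pysnooper.snoop()); mini_snoop and add_up are hypothetical names used only for illustration, and the log format is made up.

```python
import functools
import sys


def mini_snoop(func):
    """Hypothetical decorator: record executed lines and changed locals."""
    code = func.__code__

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        log = wrapper.log = []   # fresh log for every call
        seen = {}                # variable name -> last repr() we reported

        def trace_lines(frame, event, arg):
            # Report every local that is new or whose repr() has changed.
            for name, value in frame.f_locals.items():
                shown = repr(value)
                if seen.get(name) != shown:
                    seen[name] = shown
                    log.append(f"var {name} = {shown}")
            if event == "line":
                log.append(f"line {frame.f_lineno}")
            elif event == "return":
                log.append(f"return {arg!r}")
            return trace_lines

        def trace_calls(frame, event, arg):
            # Trace only the decorated function's own frame, not its callees.
            return trace_lines if frame.f_code is code else None

        previous = sys.gettrace()
        sys.settrace(trace_calls)
        try:
            return func(*args, **kwargs)
        finally:
            sys.settrace(previous)

    wrapper.log = []
    return wrapper


@mini_snoop
def add_up(n):
    total = 0
    for i in range(n):
        total += i
    return total


result = add_up(3)
print("\n".join(add_up.log))
```

The sketch covers only three of the items listed above (changed variables, the line being run, the return value); timing, file output and watching called functions are left out.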

Michael #6: Fil: A New Python Memory Profiler for Data Scientists and Scientists

  • via PyCoders
  • If your Python data pipeline is using too much memory, it can be very difficult to figure out exactly where all that memory is going.
  • Yes, there are existing memory profilers for Python that help you measure memory usage, but none of them are designed for batch processing applications that read in data, process it, and write out the result.
  • What you need is some way to know exactly where peak memory usage is, and what code was responsible for memory at that point. And that’s exactly what the Fil memory profiler does.
  • Servers run indefinitely while data pipelines run for a limited time. Because of this difference in lifetime, the impact of memory usage is different.
    • Servers: Because they run forever, memory leaks are a common cause of memory problems. Even a small amount of leakage can add up over tens of thousands of calls. Most servers just process small amounts of data at a time, so actual business logic memory usage is usually less of a concern.
    • Data pipelines: With a limited lifetime, small memory leaks are less of a concern with pipelines. Spikes in memory usage due to processing large chunks of data are a more common problem.
  • This is Fil’s primary goal: diagnosing spikes in memory usage.
  • Many tools track just Python memory. Fil captures all allocations going to the standard C memory allocation APIs.
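
The difference between current and peak memory can be illustrated with the standard library's tracemalloc module. This is not Fil, and unlike Fil it only sees allocations made through Python's own allocator; process_batch is a hypothetical stand-in for a pipeline step that builds a large temporary structure and returns a small result.

```python
import tracemalloc


def process_batch(n_items):
    """Hypothetical pipeline step: large temporary list, small result."""
    data = list(range(n_items))  # the memory spike happens here
    return sum(data)             # data is freed once the function returns


tracemalloc.start()
result = process_batch(200_000)
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

# By now the temporary list is gone, so current is far below peak.
print(f"current={current} bytes, peak={peak} bytes")
```

Measuring after the fact shows almost nothing in use, while the peak figure reveals the spike. That is the number a batch job has to fit in memory.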

Extras:

Michael:


Joke:

  • Senior dev: Where did you get the code that does this from?
  • Junior dev: Stack Overflow
  • Senior dev: Was it from the question part or from the answer part?
