If you take nothing else away from this post, let it be this: Functions should be short, or at least not overly long, and variables should be wide enough in scope to get the job done, but otherwise as narrowly scoped as is conveniently possible.
This article has been a work in progress for months: I kept picking it up and putting it down. I felt that it needed to be written, but for whatever reason I never made it all the way through. So here it is, as a sort of part one. It only covers scope. Due to the length of this article, and time, the in-depth part on functions will go into a second article with a similar title. That article will focus on dividing code into logically related pieces of functionality so they can be broken into functions, and on hiding unrelated pieces of functionality from each other, in order to maximize reusability and enhance maintainability. For now, on to the basics of scope...
Scope in Python
This article discusses practical scoping of variables and functions in Python, but it applies to other procedural languages as well. There's nothing groundbreaking here, but I still encounter experienced developers who don't seem to have put much thought into scope, and their code would benefit from it.
The target audience is anyone that has ever written Python and plans to continue doing so. This is essentially a continuation of my Yet Another Python Tutorial series, but isn't labeled as such because it's more of a discussion of definitions and practice.
Definition
To paraphrase Wikipedia, scope is the region in a program where a given name can be used to reference something.
Most think of only variables as having scope, but anything that is called by a name generally has scope. The thing being named could be a variable, function, class, module, etc.
Ideally, scopable elements should be scoped to the tightest degree possible while still affording convenient access and without requiring redundant effort (such as redefining them multiple times, or excessive work in re-fetching or re-calculating).
Variables
The scope of a variable is from the line that defines that variable to the end of the function in which that variable appears.
Variables can also be descoped with the del keyword, in which case the variable's scope runs from the line that defines the variable to the line that deletes it.
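A minimal sketch of how del ends a name's usable range inside a function. The function name describe_scope and the variable names are made up for illustration.

```python
def describe_scope():
    # `message` is usable from this assignment onward.
    message = "hello"
    seen_before_del = message

    # `del` unbinds the name; later references in this function fail.
    del message
    try:
        len(message)
    except UnboundLocalError:
        usable_after_del = False
    else:
        usable_after_del = True

    return seen_before_del, usable_after_del


result = describe_scope()
```

Here result records that the name was readable before the del and not after it.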
Conditional logic can cloud the beginning of a variable's scope because it may not actually get defined if the condition is false.
For example:
if condition:
    variable = value
print(variable)
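Here, variable is only defined when condition is True. When condition is False, the call to print will raise an exception. Inside a function, that is UnboundLocalError: local variable 'variable' referenced before assignment (the exact wording of the message varies between Python versions). At module level, it is a NameError instead.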
This is a highly contrived example, but it's a very common situation. The way to fix this is either to define variable at a wider scope, outside of the conditional, or to move the print into the block where variable is defined. The latter is most likely what needs to be done, but without the wider context of how variable will continue to be used, that decision is entirely academic. If you find yourself in this situation, consider very carefully whether it's a good idea to initialize the variable to None before the condition. It's quite common to do this, but there is almost always a better way.
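As a rough sketch of two alternatives to a None placeholder: either keep every use of the name inside the block that defines it, or make sure every branch binds it. The function names and the discount rule are hypothetical and exist only for this example.

```python
def describe(order_total):
    # Option 1: use the value inside the block that defines it.
    if order_total > 100:
        discount = order_total // 10
        return f"discount applied: {discount}"
    return "no discount"


def discount_for(order_total):
    # Option 2: both branches bind the name, so it is always defined
    # by the time it is used.
    discount = order_total // 10 if order_total > 100 else 0
    return discount
```

In both versions there is no point in the function where discount can be referenced without having been assigned.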
Global Variables
Global variables are generally created in the global scope as the top-level code of a program is executed by the Python interpreter.
If a function assigns to a name that also exists as a global, Python will create a new local variable that shadows the global, unless the global keyword is used in the scope where the assignment happens. Basically, just be sure to use the global keyword if you need to change the reference.
The global keyword does not need to be used to reference a global if the value will not be changed. For example:
variable = 0
def fun():
    print(variable)
fun()
This will print out the global value of variable, which is 0, without any need to declare it as being global.
variable = 0
def fun():
    variable = 1
fun()
print(variable)
This will print out 0 because fun will create a new variable for the scope of fun, and outside of fun any references to variable will be to the global. This is called shadowing: a variable of a tighter scope has the same name as a variable of a wider scope. It can make code hard to read.
variable = 0
def fun():
    print(variable)
    variable = 1
fun()
This code will result in an error, specifically: UnboundLocalError: local variable 'variable' referenced before assignment.
This happens because the assignment makes variable a locally scoped name throughout fun, but it has no value until the assignment after print. So when Python hits the reference in the call to print, it's referencing the still-unassigned local variable instead of the global.
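For comparison, here is a small sketch of the case where rebinding the global really is the intent. Declaring the name with the global keyword makes both the read and the assignment refer to the module-level variable. The names counter and increment are illustrative only.

```python
counter = 0


def increment():
    # Without this declaration, the assignment below would make
    # `counter` local and the read on the right-hand side would fail.
    global counter
    counter = counter + 1
    return counter


first = increment()
second = increment()
```

After the two calls, first is 1 and second is 2, and the module-level counter holds the latest value.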
Do not use the global keyword in a function for a global that didn't exist until that function was invoked. This is a maintenance nightmare and equivalent to a conditionally defined variable, as the global will only be valid if it's referenced after the function that defined it has run. Instead, create the global outside of the function, at the global scope. You could even assign its value from the result of a function if needed. That is still better than having random functions inject new variables into the global namespace.
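One way this might look, as a sketch: the global is created at module scope from the result of a function, so it exists before anything else can reference it. The helper load_default_timeout and the constant DEFAULT_TIMEOUT are hypothetical names invented for this example.

```python
def load_default_timeout():
    # Hypothetical helper; a real program might read a config file here.
    return 30


# Created at module scope, so it exists before any other function runs.
DEFAULT_TIMEOUT = load_default_timeout()


def effective_timeout(requested=None):
    # Only reads the global, so no `global` declaration is needed.
    return DEFAULT_TIMEOUT if requested is None else requested
```

Anyone reading the module can find DEFAULT_TIMEOUT at the top level instead of hunting for the function that injects it.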
Class Variables
For non-static class member variables (instance variables), scope has to play by the same rules. Generally these variables are created in a class's __init__ method and are then accessible throughout the entire life of a given instance of that class.
Class instance variables should always be defined in that class's __init__ method. Not doing so is a maintenance nightmare and equivalent to a conditionally defined variable, as they can only be used after the invocation of the method that defines them. Developers would have to know the intricate details of that class's use of the variable. If there is some special test that needs to be done before an instance variable can be used, then it shouldn't be directly exposed. Instead, a method or property should protect access to it.
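A minimal sketch of both points, assuming a made-up Connection class: every instance variable is created in __init__, and a property guards the one that isn't ready to use until open() has been called. The string stored in _socket is a stand-in for a real resource.

```python
class Connection:
    def __init__(self, host):
        # Every instance variable is created here, so it always exists.
        self.host = host
        self._socket = None

    def open(self):
        # Placeholder for real connection logic.
        self._socket = f"socket to {self.host}"

    @property
    def socket(self):
        # The property performs the check callers would otherwise
        # have to remember to do themselves.
        if self._socket is None:
            raise RuntimeError("call open() before using socket")
        return self._socket
```

Callers get a clear error if they use socket too early, rather than an AttributeError from a variable that was never defined.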
Do not initialize a set of instance variables immediately following the class definition line. These aren't instance variables; this is where statics go.
For example, if you wanted to create instance variables, don't do this:
class Foo():
    bar = None
    baz = None
This is wrong. bar and baz are statics here, not instance variables.
Instead, you'd want to do something like this:
class Foo():
    def __init__(self):
        self.bar = None
        self.baz = None
Static Class Variables
Static class member variables are created as Python executes the body of the class that defines them and, generally, they are available until the interpreter exits. Sound like a global variable? That's because it is. It's a namespaced global variable that's accessible once the class definition containing it has been loaded.
Do not create instance variables with the same name as the statics. Doing so makes the code confusing to maintain, and it's less obvious which value is being referenced.
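The difference can be sketched with a hypothetical Counter class: total is a static shared by every instance, count belongs to each instance, and assigning total through an instance quietly creates a shadowing instance variable.

```python
class Counter:
    # Static (class-level) variable: one value shared by every instance.
    total = 0

    def __init__(self):
        # Instance variable: each instance gets its own.
        self.count = 0

    def bump(self):
        self.count += 1
        Counter.total += 1


first = Counter()
second = Counter()
first.bump()
first.bump()
second.bump()

# Assigning through an instance creates a new instance variable that
# shadows the static; the static itself is untouched.
first.total = 99
```

After this runs, first.total and second.total no longer agree, which is exactly the confusion the advice above is meant to prevent.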
Functions
Functions can be thought of as pretty much just a variable that can be executed, and they play by the same scoping rules as variables. Functions are usually instantiated at load time, though, much like global variables.
The practical scope for a function depends heavily on the arguments and state that are modified by that function. For closures, the function scope should be as tight as possible. For a member method, the scope is within that class, but you should always consider whether a function needs to be a member method or if it would be more appropriate being statically defined.
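A rough sketch of both cases, using invented names. In make_averager, the closure add and the list it depends on are visible only where they are needed. In Temperature, the conversion doesn't touch instance state, so it is written as a static method.

```python
def make_averager():
    # `values` and `add` exist only inside this call; nothing else in
    # the program can reach them except through the returned function.
    values = []

    def add(value):
        values.append(value)
        return sum(values) / len(values)

    return add


class Temperature:
    def __init__(self, celsius):
        self.celsius = celsius

    @staticmethod
    def to_fahrenheit(celsius):
        # Uses no instance state, so it does not need `self`.
        return celsius * 9 / 5 + 32

    def fahrenheit(self):
        return Temperature.to_fahrenheit(self.celsius)
```

Each call to make_averager() produces an independent running average, and Temperature.to_fahrenheit can be called without creating an instance.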
Good Practices
Python allows the author to do pretty much anything they please. Programmers are free to inject variables at any time into any object or class and delete them at any time they please. You can reach into the guts of another class, replace pieces of it, and the language will even help you do this. And sometimes you may have to, in order to provide a short-term fix for a piece of critical code.
With great power comes great responsibility.
-- someone who just molested the Python runtime to monkey-patch a broken third-party library, or maybe Uncle Ben...
For maintainability of code, it's a good practice to not rely on uncommonly used or weird nuances of the language without good reason and without good documentation. In this case, good reason generally falls into one acceptable category: a temporary measure to work around a bug in a third-party library or possibly even a platform API.
You can't control other people's code, but you can write your own to a standard of excellence that inspires others to do the same in their own code.