
Python and Data Hiding

Started by May 26, 2009 05:22 PM
3 comments, last by Dauntless-v2 15 years, 5 months ago
So I have really been digging into Python as of late and started tinkering with classes here and there. One thing I noticed was the lack of data hiding. (Note: I use the term data hiding because encapsulation is more about abstraction in general; data hiding is just a small part of that.) In college I remember teachers hammering the importance of hiding our data from outside access down my throat, so overall I have been wondering why Python left something "apparently" this important out.

From my research via Google and whatnot, I learned Python was designed to give the programmer expressiveness rather than forcing programmers into a certain practice. I am trying to wrap my head around this idea of not protecting implementation data that is not meant to be modified. It seems to rest more on trust than anything.

I did notice a few methods of what I would call pseudo data hiding in Python:

- using a single underscore, as in self._a
- using self._a together with __getattr__
- using properties
- using a double underscore, as in self.__a

What is the difference between these methods?
Data hiding is a means to procure encapsulation. Encapsulation is the true goal, not hidden data. The Python approach to getting encapsulation is by simply not using "private" data members. If something is particularly "private" you can use the double leading underscores to mark it as such, but this of course is nothing more than a marking. I never use it except to remind myself that such-and-such is intended to be used only within the class.

If I may use an analogy, this is not unlike the use of inheritance to implement polymorphism. Inheritance is simply a convenient and useful means of getting the benefits of polymorphism, but the point of including it is to use polymorphic code rather than to inherit from one class to another. Some languages do not have inheritance because it is unnecessary. Python chose to include it for some other reasons, but the run-time typing makes inheritance not necessary for polymorphic code.

I've never heard of a single underscore being used to signify private members and the docs say that only the leading double underscore causes the name mangling for data hiding.
Yeah, the docs say a double underscore mangles the name. However, all over the explanations on Google and on the Python mailing list I see talk of a single underscore; some mention properties and others use __getattr__. I think someone said that a single underscore keeps the name out of the API documentation or something of that nature, but I'm not sure.

But nonetheless, thank you for your explanation; that makes sense.
Quote: Original post by blewisjr
I am trying to wrap my head around this idea of not protecting implementation data that is not meant to be modified. It seems to rest more on trust than anything.

Yes. Philosophically, Python espouses a trust of and respect for your client programmer. If you say something is not to be touched, then your client won't touch it. If, however, your client is intent on touching it, no amount of data hiding will prevent him/her from doing so - programmers have been known to offset by the size of preceding data members from the address of an object instance to access a following data member in C++. In effect, people who aren't going to respect your encapsulation aren't going to respect your encapsulation, so why spend a ton of effort implementing something of low practical benefit?

Single underscores don't really mean much in Python. Names that start with one won't be imported from a module when you do a from X import *, but other than that a leading underscore is described as a "weak" internal use indicator. The Python style guide can be found in PEP 8.
As you mentioned, Python doesn't really enforce data-hiding in the way that you mention. Remember that Python is a dynamic language and that you can add attributes or methods dynamically on an object.

    class A:
        pass

    a = A()
    a.x = "something new"


Python's sense of encapsulation means that you don't have to worry about implementation details, not whether an attribute has public or private access. Although Python doesn't explicitly support access modifiers, you can "fake" them.

A leading double underscore means that an attribute is private in the sense that if you try to access the attribute directly (via its name) from outside the class, you won't be able to. Python does this by quietly name-mangling the variable: the stored name is a leading underscore, the name of the class, and then the name of the variable (so __hidden in class Example becomes _Example__hidden). For example, if you have something like:

    class Example:
        def __init__(self):
            self.__hidden = "Don't access me directly"

        def get__hidden(self):
            return self.__hidden

    >>> e = Example()
    >>> e.__hidden
    Traceback (most recent call last):
      File "<input>", line 1, in <module>
    AttributeError: Example instance has no attribute '__hidden'
    >>> e.get__hidden()
    "Don't access me directly"
    >>> dir(e)
    ['_Example__hidden', '__doc__', '__init__', '__module__', 'get__hidden']
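
To make the "nothing more than a marking" point concrete, here is a minimal sketch (written for a current Python 3 interpreter, which is an assumption; the thread itself predates it) showing that the mangled name is still reachable from outside the class. The variable names reachable_by_plain_name and mangled_value are made up for illustration.

```python
class Example:
    def __init__(self):
        # Inside the class body, __hidden is stored as _Example__hidden.
        self.__hidden = "Don't access me directly"


e = Example()

# The plain name fails from outside the class...
try:
    e.__hidden
    reachable_by_plain_name = True
except AttributeError:
    reachable_by_plain_name = False

# ...but the mangled name works, so this is a convention, not a lock.
mangled_value = e._Example__hidden

print(reachable_by_plain_name)  # False
print(mangled_value)            # Don't access me directly
```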


A single leading underscore is simply an indicator to other programmers that they really shouldn't be directly accessing a variable (or method). It also means that when you do a 'from mymodule import *', anything with a single leading underscore doesn't get imported (unless the module explicitly lists it in __all__).
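
The import behaviour can be checked without creating files. This is only a sketch: it builds a throwaway in-memory module (the name demo_mod and both variables inside it are hypothetical) and runs a star import against it.

```python
import sys
import types

# Build a throwaway module in memory; "demo_mod" is a made-up name.
demo_mod = types.ModuleType("demo_mod")
exec("public_value = 1\n_internal_value = 2\n", demo_mod.__dict__)
sys.modules["demo_mod"] = demo_mod

# Run the star import into a fresh namespace and see what arrived.
namespace = {}
exec("from demo_mod import *", namespace)
star_imported = sorted(n for n in namespace if n != "__builtins__")

print(star_imported)             # ['public_value']
print(demo_mod._internal_value)  # 2, still reachable by its explicit name
```

The underscore only affects the star import; asking for the name explicitly still works, which fits the "weak indicator" description.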

Python has a feature similar to .NET properties (and in fact, Python calls them properties too). If you are using new-style classes (i.e., inheriting directly or indirectly from object, or using __metaclass__ = type somewhere in your module), then you can use the property protocol. Basically what this means is that Python will use accessor and mutator methods that you define, but they are called implicitly for you depending on the context.
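
A small sketch of the property protocol, assuming a hypothetical Temperature class (not something from this thread). Reading t.celsius calls the accessor and assigning to it calls the mutator, so validation can be added later without changing client code.

```python
class Temperature(object):
    def __init__(self, celsius):
        self._celsius = celsius  # single underscore: "internal" by convention

    @property
    def celsius(self):
        # Called implicitly on attribute read: t.celsius
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        # Called implicitly on attribute assignment: t.celsius = ...
        if value < -273.15:
            raise ValueError("below absolute zero")
        self._celsius = value


t = Temperature(20.0)
t.celsius = 25.0
print(t.celsius)  # 25.0

try:
    t.celsius = -300.0
    rejected = False
except ValueError:
    rejected = True
print(rejected)  # True
```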

You can sort of implement your own protection scheme by using the magic methods __getattr__ and __setattr__. If __setattr__ is implemented, it is called whenever someone assigns to any of an instance's attributes. __getattr__ is called only when normal attribute lookup fails, which is exactly what happens when outside code asks for a name-mangled attribute by its unmangled name.

    class Example:
        def __init__(self):
            self.__hidden = "Don't access me directly"

        def get__hidden(self):
            return self.__hidden

        def __getattr__(self, name):
            # Reached only when normal lookup fails, e.g. for the unmangled name.
            if name == "__hidden":
                print("You really shouldn't be using me directly")
            else:
                raise AttributeError(name)

    >>> e = Example()
    >>> e.__hidden
    You really shouldn't be using me directly
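
For the __setattr__ half of that idea, here is a hedged sketch of a read-only object. The class name Frozen and its value attribute are invented for illustration, and this is one possible approach rather than an established recipe.

```python
class Frozen(object):
    def __init__(self, value):
        # Go through object.__setattr__ so our own hook does not block us.
        object.__setattr__(self, "value", value)

    def __setattr__(self, name, new_value):
        # Called for every attribute assignment on an instance.
        raise AttributeError("%s is read-only" % name)


f = Frozen(10)

try:
    f.value = 99
    blocked = False
except AttributeError:
    blocked = True

print(blocked)  # True
print(f.value)  # 10

# A determined caller can still go around the hook.
object.__setattr__(f, "value", 99)
print(f.value)  # 99
```

The last lines show why this is still a matter of trust: the hook stops accidents, not someone who is intent on touching the data.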


The larger lesson here is that Python does things differently. For example, instead of having interfaces like Java or C# that are bona fide types, Python uses what is called "duck typing": if it walks like a duck and talks like a duck, it is a duck.
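
A minimal duck-typing sketch, with made-up classes Duck and Robot that share no base class or declared interface. The function only cares that its argument has a speak method.

```python
class Duck:
    def speak(self):
        return "Quack"


class Robot:
    # Unrelated to Duck: no common base class, no interface declaration.
    def speak(self):
        return "Beep"


def make_it_speak(thing):
    # Works for anything that has a speak() method.
    return thing.speak()


sounds = [make_it_speak(x) for x in (Duck(), Robot())]
print(sounds)  # ['Quack', 'Beep']
```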

It took me a little while to get used to dynamic programming, but now I love it. Granted, it does require a little more diligence, but I find that this enforces documentation (and docstrings are cool). And when things are dynamically typed, it actually forces you to think more polymorphically, because you may not exactly know what might get thrown in (or out). There are ways to check this at runtime through introspection, or using functions like callable (which was removed in Python 3.0, though it returned in 3.2), but this is as ugly as using RTTI in C++.
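
To illustrate the two styles being compared, here is a rough sketch with hypothetical names (Dog, describe_checked, describe_trusting). The first function inspects the object before using it; the second just tries the call and handles the failure. It assumes a Python version where callable is available.

```python
class Dog:
    def speak(self):
        return "Woof"


def describe_checked(obj):
    # Introspection first: look up the attribute, then test it.
    speak = getattr(obj, "speak", None)
    if callable(speak):
        return speak()
    return "<cannot speak>"


def describe_trusting(obj):
    # Just try it, and deal with objects that do not fit.
    try:
        return obj.speak()
    except AttributeError:
        return "<cannot speak>"


results = [
    describe_checked(Dog()),
    describe_checked(42),
    describe_trusting(Dog()),
    describe_trusting(42),
]
print(results)  # ['Woof', '<cannot speak>', 'Woof', '<cannot speak>']
```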

This topic is closed to new replies.
