Data Classes vs typing.NamedTuple primary use cases
--
Full question
https://stackoverflow.com/questions/5167...
Question links:
[PEP-557]: https://www.python.org/dev/peps/pep-0557/
[Why not just use namedtuple]: https://www.python.org/dev/peps/pep-0557...
[that one]: https://stackoverflow.com/questions/4795...
Accepted answer links:
[Raymond Hettinger - Dataclasses: The code generator to end all code generators]: https://www.youtube.com/watch?
Answer 2 links:
https://shayallenhill.com/python-struct-.../
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #namedtuple #pep #python37 #pythondataclasses
ACCEPTED ANSWER
Score 203
It depends on your needs. Each has its own benefits.
For a good explanation of dataclasses, see Raymond Hettinger's PyCon 2018 talk, Dataclasses: The code generator to end all code generators.
In a dataclass, the whole implementation is written in Python, whereas in a NamedTuple all of these behaviors come for free because NamedTuple inherits from tuple. And because tuple is implemented in C, the standard methods (hashing, comparison, and so on) are faster in a NamedTuple.
Note also that a dataclass instance stores its fields in a dict (its __dict__), whereas a NamedTuple stores them in a tuple. Each structure therefore has its own advantages and disadvantages: for example, a NamedTuple uses less space, but attribute access is faster with a dataclass.
Here is my experiment:
In [33]: a = PageDimensionsDC(width=10, height=10)
In [34]: sys.getsizeof(a) + sys.getsizeof(vars(a))
Out[34]: 168
In [35]: %timeit a.width
43.2 ns ± 1.05 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [36]: a = PageDimensionsNT(width=10, height=10)
In [37]: sys.getsizeof(a)
Out[37]: 64
In [38]: %timeit a.width
63.6 ns ± 1.33 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
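The session above does not show how the two classes were defined. A minimal sketch, assuming PageDimensionsDC is a plain @dataclass and PageDimensionsNT a typing.NamedTuple with two int fields (these definitions are an assumption, not the answerer's code), repeats the measurement outside IPython. Exact sizes and timings vary with the Python version and machine, so no expected numbers are given.

```python
import sys
import timeit
from dataclasses import dataclass
from typing import NamedTuple


@dataclass
class PageDimensionsDC:  # assumed definition, not shown in the answer
    width: int
    height: int


class PageDimensionsNT(NamedTuple):  # assumed definition, not shown in the answer
    width: int
    height: int


dc = PageDimensionsDC(width=10, height=10)
nt = PageDimensionsNT(width=10, height=10)

# A dataclass instance keeps its fields in a per-instance dict, so count both.
dc_size = sys.getsizeof(dc) + sys.getsizeof(vars(dc))
# A NamedTuple instance has no __dict__; the tuple itself holds the fields.
nt_size = sys.getsizeof(nt)

# The lambda adds the same call overhead to both measurements.
dc_time = timeit.timeit(lambda: dc.width, number=100_000)
nt_time = timeit.timeit(lambda: nt.width, number=100_000)

print(f"dataclass:  {dc_size} bytes, {dc_time:.4f} s per 100k reads")
print(f"NamedTuple: {nt_size} bytes, {nt_time:.4f} s per 100k reads")
```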
As the number of attributes of a NamedTuple grows, access time stays just as small, because a property is created for each attribute under the attribute's name. For example, in our case part of the namespace of the new class will look like:
from operator import itemgetter

class_namespace = {
    # ...
    'width': property(itemgetter(0), doc="Alias for field number 0"),
    'height': property(itemgetter(1), doc="Alias for field number 1")
}
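To see how such properties give attribute access on top of plain tuple storage, here is a hand-written sketch of roughly what gets generated. The class name PageDimensions is hypothetical, and newer CPython releases use a faster built-in getter instead of property(itemgetter(...)), so treat this as an illustration only.

```python
from operator import itemgetter


class PageDimensions(tuple):
    """Hand-rolled stand-in for a generated named tuple (illustration only)."""

    __slots__ = ()  # no per-instance dict; the tuple holds the data

    def __new__(cls, width, height):
        # Store the fields positionally in the underlying tuple.
        return super().__new__(cls, (width, height))

    # Each field name is a read-only property that indexes into the tuple.
    width = property(itemgetter(0), doc="Alias for field number 0")
    height = property(itemgetter(1), doc="Alias for field number 1")


page = PageDimensions(10, 20)
print(page.width, page.height)  # attribute access
print(page[0], page[1])         # the same values by index
```

Because every field is just an index lookup, adding more fields does not make any single lookup slower.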
In which cases is namedtuple still a better choice?
When your data structure needs to be (or can be) immutable, hashable, iterable, unpackable, and comparable, you can use NamedTuple. If you need something more complicated, for example inheritance for your data structure, then use a dataclass.
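A short sketch of the properties listed above. The class names Point, Shape, and Circle are made up for illustration.

```python
from dataclasses import dataclass
from typing import NamedTuple


class Point(NamedTuple):
    x: int
    y: int


p = Point(1, 2)

x, y = p                     # unpackable
coords = list(p)             # iterable
labels = {p: "a point"}      # hashable, so usable as a dict key
is_smaller = p < Point(1, 3)  # comparable, field by field like a tuple

try:
    p.x = 5                  # immutable: assignment is rejected
    mutated = True
except AttributeError:
    mutated = False


# Dataclasses support ordinary inheritance, including adding fields.
@dataclass
class Shape:
    name: str


@dataclass
class Circle(Shape):
    radius: float = 1.0


circle = Circle(name="unit circle")
```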
ANSWER 2
Score 48
In programming in general, anything that CAN be immutable SHOULD be immutable. We gain two things:
- The program is easier to read: we don't need to worry about values changing, because once a namedtuple is instantiated, it never changes
- Less chance for weird bugs
That's why, if the data is immutable, you should use a named tuple instead of a dataclass.
I wrote it in the comment, but I'll mention it here:
You're definitely right that there is an overlap, especially with frozen=True in dataclasses, but namedtuples still have features such as unpacking, and they are always immutable. I doubt they'll remove namedtuples as such.
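The overlap and the remaining difference can be sketched as follows. FrozenPoint and TuplePoint are hypothetical names; the point is that a frozen dataclass rejects assignment but does not unpack directly.

```python
from dataclasses import FrozenInstanceError, astuple, dataclass
from typing import NamedTuple


@dataclass(frozen=True)
class FrozenPoint:
    x: int
    y: int


class TuplePoint(NamedTuple):
    x: int
    y: int


fp = FrozenPoint(1, 2)
tp = TuplePoint(1, 2)

# Both reject attribute assignment.
try:
    fp.x = 10
    frozen_is_immutable = False
except FrozenInstanceError:
    frozen_is_immutable = True

# Only the named tuple unpacks directly.
tx, ty = tp
try:
    fx, fy = fp
    frozen_unpacks = True
except TypeError:
    frozen_unpacks = False

# A dataclass needs an explicit conversion first.
fx, fy = astuple(fp)
```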
ANSWER 3
Score 41
I had this same question, so I ran a few tests and documented them here: https://shayallenhill.com/python-struct-options/
Summary:
- NamedTuple is better for unpacking, exploding, and size.
- DataClass is faster and more flexible.
- The differences aren't tremendous, and I wouldn't refactor stable code to move from one to another.
- NamedTuple is also great for soft typing when you'd like to be able to pass a tuple instead.
To do this, define a type inheriting from it...
from typing import NamedTuple

class CircleArg(NamedTuple):
    x: float
    y: float
    radius: float
...then unpack it inside your functions. Don't use the .attributes, and you'll have a nice "type hint" without any PITA for the caller.
*focus, radius = circle_arg_instance # or tuple
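Put together, the pattern might look like the sketch below. The function circle_area is a hypothetical example. Passing a plain tuple works at runtime, though a strict static type checker may still flag it.

```python
import math
from typing import NamedTuple


class CircleArg(NamedTuple):
    x: float
    y: float
    radius: float


def circle_area(circle: CircleArg) -> float:
    # Unpack positionally instead of using .x / .y / .radius,
    # so any 3-item tuple works as well as a CircleArg.
    *focus, radius = circle
    return math.pi * radius ** 2


area_from_type = circle_area(CircleArg(0.0, 0.0, 2.0))
area_from_tuple = circle_area((0.0, 0.0, 2.0))
```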
ANSWER 4
Score 14
Another important limitation of NamedTuple is that, before Python 3.11, it cannot be generic:
import typing as t

T = t.TypeVar('T')

class C(t.Generic[T], t.NamedTuple): ...
# TypeError: Multiple inheritance with NamedTuple is not supported
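If a generic record type is needed on a Python version where the snippet above raises this error, one option is a dataclass, which can inherit from Generic. A minimal sketch; Box and first_item are hypothetical names.

```python
from dataclasses import dataclass
from typing import Generic, List, TypeVar

T = TypeVar("T")


@dataclass
class Box(Generic[T]):
    items: List[T]

    def first_item(self) -> T:
        # The return type follows whatever T the box was annotated with.
        return self.items[0]


int_box: Box[int] = Box([1, 2, 3])
str_box: Box[str] = Box(["a", "b"])
```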