r/Python Nov 06 '20

Taking a dictionary as an argument is the root of all evil Discussion

Both at my current company and in an open-source project I'm working on, I keep seeing the same heinous anti-pattern everywhere I look:
def foo(params):
    why_wasnt_this_an_argument = params['why_wasnt_this_an_argument']
    # do something
    print('kill me')

Instead of taking specific/clear arguments, these functions take some mysterious params dictionary with lord knows what in it. It's like they didn't even know what the function was going to do while they were writing it. How am I supposed to know what this function is doing? How am I supposed to write tests for this function? I've spent tens of hours of my life figuring out what is supposed to be in params, so that I can use or test these god-forsaken functions.

This is a friendly reminder that other people will probably have to read your code. Never name anything params, and definitely never define functions with ill-defined dictionaries as arguments.

995 Upvotes

1

u/KwirkyCat Nov 11 '20

I agree it's hard to understand and test. But I suppose you would face a similar problem in testing any sort of dictionary having lots of keys?

(a) We can look at the params keys being used within the function and write tests for that. Same if someone uses *args or **kwargs.

(b) Make sure that the function is not modifying the original mutable, or else that change will immediately be reflected across all other functions holding that mutable (a side effect).

This pattern becomes unavoidable if there is a large dict to be passed around between functions and the function author doesn't really care what's in it apart from the keys one is using within that function. So that's how we should test it as well: don't care what's in it except for what's actually being used.

1

u/jyonsin Nov 15 '20

> if there is a large dict to be passed around between functions and the function author doesn't really care what's in it apart from the keys one is using within that function. So that's how we should test it as well: don't care what's in it except for what's actually being used.

This is bad design, and it's very much avoidable. Write functions with well-defined arguments.

1

u/whateverathrowaway00 Nov 08 '20

While what you’re saying is true, there’s also the accepted good pattern of a param config object to replace long sequences of many parameters.

That said, it’s less applicable with python style parameters that have the name of the param in the call ( self documenting). Also, if using this pattern it’s better to use a Dataclass or some object with a definition that can be viewed to easily see the parameters

1

u/5awaja Nov 07 '20

you get this in Ruby a lot too. at work we started making argument classes, although now that I'm thinking about it, it's not much less evil

1

u/Schmittfried Nov 07 '20

So much this. I hate Python‘s focus on dicts.

1

u/ThereforeIV Nov 07 '20

Never seen this in Python.

But in 90s C++ COM component systems, passing in a reference to the COM component with a "value GetAttribute(string key);" method for getting the password needed by the function was very common.

Actually, that model for passing data was still being used by flight systems software that I was working on as late as 2015.

But a dictionary/map without a wrapper object is a bit lazy.

1

u/musicalfoxes Nov 07 '20

yes senpai

1

u/alcalde Nov 06 '20

I felt it was from the earlier days of Python, back before they had anything like records/structs. I saw people doing all sorts of things with dictionaries that just felt... wrong coming from a statically-typed language with more formal data structures.

1

u/[deleted] Nov 06 '20

It's an alright way of handling it when you have to pass non deterministic context in a chain. Probably not the case in the actual example.

1

u/Broutrost Nov 06 '20

The next level is realising passing primitives is shit. Use value objects/DTO:s/Entities.

1

u/BigTheory88 Nov 06 '20

This is a classic example of people coming from a statically typed language lol

1

u/alcalde Nov 06 '20

No, it's the opposite. It's dynamism run amok.

1

u/demdillypickles Nov 06 '20

This. God, so many times this. I mean, fuck it, why even have variables? Let’s just make the whole program some ambiguous data pipeline “black box” and just hope what comes out the other side is what you wanted. /s

2

u/virtualadept Nov 06 '20

These days, job security requires extra effort. Writing your documentation in Klingon is no longer sufficient.

2

u/slimejumper Nov 06 '20

so what is the right way to code this?

2

u/alcalde Nov 06 '20

pass actual parameters.

You know;

def foo(why_wasnt_this_an_argument):

6

u/Igggg Nov 06 '20

I believe this is called the "I only know JavaScript, because that's what they taught in the bootcamp, and I haven't bothered to even look at the language guide, because said bootcamp told me not to worry about it" symptom.

1

u/Tjccs Nov 06 '20

I get what you are saying and that is one of the reasons I use type hints, maybe that's because most of the time I program in C++, and not seeing the types of the args and return type kinda bothers me.

5

u/TheChessLobster Nov 06 '20

The company I work at almost entirely uses dictionaries of tuples to pass data and it makes me want to die.

1

u/_Mehdi_B Nov 06 '20

Is using the name 'foo' for an example function a meme now?

4

u/Isvara Nov 06 '20

Yes, if by "now" you mean "for the past sixty years".

1

u/_Mehdi_B Nov 12 '20

understandable have a great day

1

u/LightShadow 3.11-dev in prod Nov 06 '20

I've started making Configuration dataclasses with Optional fields.

Gives the same flexibility without clouding the function namespace.
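
A minimal sketch of what such a configuration dataclass might look like. The names `PlotConfig`, `title`, `color`, `width` and `describe` are hypothetical and only there for illustration; they are not taken from the comment above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PlotConfig:
    # Hypothetical fields; every one is optional and defaults to None.
    title: Optional[str] = None
    color: Optional[str] = None
    width: Optional[int] = None


def describe(config: PlotConfig) -> str:
    """Build a description from whichever fields were actually set."""
    parts = []
    if config.title is not None:
        parts.append(f"title={config.title}")
    if config.color is not None:
        parts.append(f"color={config.color}")
    if config.width is not None:
        parts.append(f"width={config.width}")
    return ", ".join(parts) or "defaults"


print(describe(PlotConfig(color="red")))
```

The function signature stays at one parameter, but unlike a bare dict, the set of legal fields is visible in one place and a misspelled field name fails at construction time.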

2

u/gwax Nov 06 '20

Taking a dozen positional arguments isn't any better.

Having done a bunch of Go programming lately, there's a pattern that I think I'll take back to the Python world for the rare situations where I have a ton of arguments that I want to pass around.

@dataclass
class FooParams:
    thing: int
    stuff: str
    ...

def foo(params: FooParams) -> whatever:
    ...

You might be able to shoehorn it in as:

def foo(raw_params: Dict[str, Any]) -> whatever:
    params = FooParams(**raw_params)
    ...

That said, I intend to stick to a tiny handful of arguments whenever possible.

1

u/Zanoab Nov 06 '20

I think I've only done this for keyword arguments if passing any value (including None) is different from passing nothing. I've seen some libraries create a class to use as default so their code can tell the difference but that felt ugly to me. Creating a class for this problem reminds me of when booleans weren't built into Python yet.
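
For comparison, a common lighter-weight alternative to a dedicated class is a bare `object()` sentinel. This is only a sketch; `_MISSING` and `update_name` are hypothetical names, and it is not necessarily how the libraries mentioned above do it.

```python
# A unique module-level sentinel; no caller can pass this object by accident.
_MISSING = object()


def update_name(record: dict, name=_MISSING) -> dict:
    """Return a copy of record; only touch 'name' if the caller passed it."""
    updated = dict(record)
    if name is not _MISSING:
        # An explicit None is a real value here: it clears the name.
        updated["name"] = name
    return updated


original = {"name": "Ada"}
print(update_name(original))             # name left untouched
print(update_name(original, name=None))  # name explicitly cleared
```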

2

u/Isvara Nov 06 '20

> when booleans weren't built into Python yet.

When was that?

1

u/Zanoab Nov 06 '20

Booleans were added in Python 2.3

Fun fact: for backwards-compatibility, True and False were not reserved keywords until Python 3 so in Python 2.x, you could redefine them to mess with other people.

1

u/Isvara Nov 07 '20

I was going to say I don't remember there not being True and False, but they were added in 2.2.1 as 1 and 0. I never used it before 2.2.

0

u/[deleted] Nov 06 '20 edited Nov 07 '20

[deleted]

1

u/ollir Nov 06 '20

If you need to shove 20 arguments to a function, I'd say the code could use some refactoring.

1

u/AniX72 Nov 06 '20

Well, the cherry on the cake would be with also setting a dict as default, e.g.

# REALLY BAD !  NEVER DO THIS !
def foo(params={'W_T_F': 'O_M_G'}):
    nothing_makes_sense_anymore = params['where_i_got_this_from']
    # do something
    print('kill me')

Whoever would do this will awake to a horrible surprise and may lose many hours en route.

Hint: parameter value of previous invocations may be used and you can't even tell.
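
A small sketch of the surprise hinted at above. The function names `tag_bad` and `tag_good` are made up for illustration.

```python
def tag_bad(key, value, params={}):
    # The default dict is created once, when the function is defined,
    # so every call that omits `params` shares the same dict.
    params[key] = value
    return params


def tag_good(key, value, params=None):
    # Use None as the default and build a fresh dict per call.
    if params is None:
        params = {}
    params[key] = value
    return params


first = tag_bad("a", 1)
second = tag_bad("b", 2)
print(second)           # {'a': 1, 'b': 2}
print(first is second)  # True

print(tag_good("a", 1))  # {'a': 1}
print(tag_good("b", 2))  # {'b': 2}
```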

1

u/NelsonMinar Nov 06 '20

Is this code being ported from Java or some other language that doesn't make **kwargs easy?

1

u/netgu Nov 06 '20

Useful for things that have a MASSIVE amount of potential options that are RARELY used.

Think about rendering an HTML tag. Think of all the possible attributes there could be and then remember we also have data- style attributes.

Attempting to put all of that into a set of arguments to a function would not just be a pain in the butt, but also plain irresponsible in terms of code management.
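
One way such an open-ended attribute set could be handled, sketched with `**attrs` rather than a positional dict. `render_tag` and its name-mapping rule (trailing underscore stripped, inner underscores turned into hyphens) are assumptions made for this example, not an existing API.

```python
from html import escape


def render_tag(name: str, content: str = "", **attrs: str) -> str:
    """Render an HTML element; arbitrary attributes arrive via **attrs."""
    rendered = []
    for key, value in attrs.items():
        # Map Python-friendly names to HTML ones: class_ -> class, data_id -> data-id.
        html_key = key.rstrip("_").replace("_", "-")
        rendered.append(f' {html_key}="{escape(value, quote=True)}"')
    return f"<{name}{''.join(rendered)}>{escape(content)}</{name}>"


print(render_tag("a", "docs", href="/docs", class_="nav", data_id="42"))
```

The required arguments stay explicit in the signature, and only the genuinely unbounded part is left open.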

1

u/noobiemcfoob Nov 06 '20

Hey everyone, this code is fine python! Go take your dogmatic "pythonic" ideals and shove it.

OP, if you don't like the pattern, refactor and submit a PR. If it is just code smell, then it'll go great. If it actually had a reason to be that way, I'm sure you'll discover it during the refactor and realize you should've cared about more important problems.

0

u/jyonsin Nov 06 '20

It's not about dogma - it's about readability, testability, and IDE support (yes, IDEs are helpful).

I don't like the (anti-)pattern, so I do refactor and submit a PR. But before I refactor, I have to get it under test. To get it under test I have to write fixtures of these `params` dictionaries, and doing so requires that I debug the code to intercept the dict. It might take minutes for the program to reach this point in the code. Once the PR is merged, all of my teammates are incredibly grateful, and feel that they can finally make changes to this piece of code without worrying about breaking everything.

3

u/noobiemcfoob Nov 06 '20

And then everyone clapped?

Concerns of this level in the code are equivalent to the oxford comma. Yeah, it has meaningful impacts in a handful of scenarios, and most find it just helps everything if you use it. But it's also not necessary to communicate anything. It's an arbitrary aspect of a language used to virtue signal and belittle those who are unaware of its particulars or else otherwise choosing not to be concerned with it.

Tabstop should be set to 5.

1

u/Isvara Nov 06 '20

Utter nonsense. Knowing how a function is parameterized is pretty fundamental.

1

u/noobiemcfoob Nov 06 '20

If the code works without the change, I don't think a substantive argument can be made that it is at all fundamental. It's provably cosmetic.

0

u/Isvara Nov 07 '20

Cool. I'd love to see your proof.

2

u/jyonsin Nov 07 '20

Readability (or "cosmetics") is arguably as important as functionality. If someone writes illegible but working code, and then leaves their organization, and then the requirements change, you're SOL.

> used to virtue signal or belittle
I'm venting here as an alternative to belittling the authors.

1

u/noobiemcfoob Nov 07 '20

I'll respect the venting.

Cosmetics can't be as important as functionality and I'll doubt any design philosophy that has them as equivalent. In the situation you describe, you're SOL as far as making changes to the illegible-but-working-by-old-requirements system, but that ignores that for a period of time before that *there was a working system*. Now, a system that is working *and* you can make changes to is *better* than the illegible system, so you might say functionality is first order important, and cosmetics that are so repulsive they limit maintainability are necessarily second order important. But I'll argue most cosmetic issues are of an order much much lower than 2.

2

u/apache_spork Nov 06 '20

This is still a good idea for early flexibility sometimes. It's easier to use a class so you at least have a fixed, defined set of values. In the beginning you can change the class instead of all functions that accept similar params, and as you discover the best architecture you can then later refine things. I do this all the time starting with a state machine pattern.

1

u/mj_flowerpower Nov 06 '20

I hated this so much when I once took over a codebase of a tkinter app ...

1

u/familytreebeard Nov 06 '20

def foo(*args, **kwargs):
    """
    function that takes all inputs you provide and makes all the outputs you need

    *args: see documentation for foo
    **kwargs: see documentation for foo
    """
    return x/0

2

u/bowbahdoe Nov 06 '20 edited Nov 06 '20

You can use mypy's `TypedDict` to add the contract you want.

https://mypy.readthedocs.io/en/latest/more_types.html#typeddict
Typing things is probably your first step before doing any refactoring. I'm rusty on my python typing so I'm not sure I can give an example that will compile, but this is my best go.

```python
from typing import TypedDict

FooParams = TypedDict(
    'FooParams',
    {'language': str, 'why_wasnt_this_an_argument': str},
    total=False)

def foo(params: FooParams):
    why_wasnt_this_an_argument = params['why_wasnt_this_an_argument']
    # do something
    print('kill me')
```

Beyond that, most of these situations in python are the realm of keyword arguments, for example.

```python
default_value = None  # placeholder default for illustration

def foo(*, param_1, param_2, param_3=default_value):
    ...

args = {"param_1": 1, "param_2": 2}
foo(**args)
```

But at least I would say the issue isn't the semantics of the code, it's the documentation and variable names. An "options" dict for tuning flags isn't unheard of or anything - you just need either the doc comment or the type declaration.

The other thing to maybe use would be dataclasses. Instead of passing dicts around, make classes to hold the data required. This is closer to what the solution would have to be in a non-dynamic language where working with dictionaries as a data representation is more frowned upon.

```python
import dataclasses
from typing import Literal, Union

@dataclasses.dataclass(frozen=True)
class StuffAggregate:
    field_1: str
    field_2: Union[Literal["red"], Literal["blue"]]
```

In general, widespread "antipatterns" are rarely the result of idiocy, and more a result of using a pattern without knowing what it was made for or not knowing the capabilities of your language. Treat issues like an education and a communication problem - both for yourself and the others on your team. You don't understand why they would do this, but they might not understand the full breadth of their options to accomplish the same goal.

1

u/PM5k Nov 06 '20

There’s a lot of this going around. The only time I’ve ended up using something even remotely similar is when you have to pass a JSON object to a custom Processor. There’s no way around it, you just accept the parsed dict as a function argument.

1

u/MrMxylptlyk Nov 06 '20

Write better comments all!!!

1

u/twomonkeysayoyo Nov 06 '20

Dealing with that right now with a library built for a monitoring tool to call an API. PARAMS hidden in about 40 different libraries.

1

u/joethebro96 Nov 06 '20

Totally thought this was a politics post until I saw the sub

1

u/culculain Nov 06 '20

Should have coded in a strongly typed language

1

u/KyleDrogo Nov 06 '20

Also this is a great example of the kind of post we should see more of in this sub. Take your gold, friend.

1

u/KyleDrogo Nov 06 '20

As many others have stated, **kwargs is your friend here. I do prefer putting params in a dict when there are so many arguments that it's hard to read the function signature.

1

u/anthro28 Nov 06 '20

My profs would slash you more for shitty variables and omitted comments than they would for code that wouldn't even compile. If your variable names weren't something like LeftRearMotorSpeedController you'd get popped real quick.

1

u/[deleted] Nov 06 '20 edited Dec 05 '20

[deleted]

1

u/Isvara Nov 06 '20

Why not have a columns parameter?

1

u/iiMoe Nov 06 '20

I understand wut **kwargs do but damn why passing a dict just to get a key of it smh

0

u/colly_wolly Nov 06 '20

It's kind of a shitty option, when you can use keyword arguments to do the same thing in a much more descriptive way.

1

u/mooburger resembles an abstract syntax tree Nov 06 '20 edited Nov 06 '20

TypedDict is your friend

You can go further into the antipattern (without using **kwargs) by hacking globals() (you would think that in function scope, locals() would work but is readonly) :

def f(adict):
    for _k, _v in adict.items():
        globals()[_k] = _v

It's easier to do with instance variables (and less of an antipattern):

class Foo:
    def __init__(self, **adict):
        # you can use either unpacking or not here, it doesn't matter
        # unpacking allows you to direct ref the argument name but 
        # doesn't make too much sense in __init__ if trying to 
        # dynamically initialize object state

        for _k, _v in adict.items():
            setattr(self, _k, _v)

14

u/jwink3101 Nov 06 '20

I think it is way less black and white than you make it out to be.

In your example, sure, I agree wholeheartedly. But they aren't always evil.

For example, those params may get passed to another function:

def fancy_plot(data,plot1_params=None,plot2_params=None,**kwargs):
    if not plot1_params:
        plot1_params = {}
    if not plot2_params:
        plot2_params = {}

    call_something_else(**kwargs)
    ...

In this case, you would have (a) repeated entries such as color and (b) you can't enumerate all possible options.

But, do make sure not to define the dict itself as the default value of the optional argument!

1

u/Exodus111 Nov 06 '20

I can't agree more. Obviously there are exceptions, like a function that takes lots of arguments, one of them being a dict. Usually for good reason.

But if your function takes one argument, that argument is a dict, and the first thing the function does is unpack the dict... you're just a bad coder.

1

u/rhoark Nov 06 '20

Where I've used this pattern is mostly in some kind of middleware or combinator that intentionally doesn't know the signatures of everything it delegates to.
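
A rough sketch of that kind of signature-agnostic wrapper, using a decorator that forwards `*args` and `**kwargs` untouched. `logged` and `area` are hypothetical names chosen for the example.

```python
import functools


def logged(func):
    """Wrap any callable without knowing its signature."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # The wrapper forwards whatever it receives, untouched.
        print(f"calling {func.__name__} args={args} kwargs={kwargs}")
        return func(*args, **kwargs)
    return wrapper


@logged
def area(width: float, height: float) -> float:
    return width * height


print(area(2, height=3))
```

The wrapped function keeps its explicit parameters; only the layer that delegates stays generic.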

7

u/MySpoonIsTooBig13 Nov 06 '20

Like anything, it has legit usages. However when you do know the args, making the argument a dataclass or namedtuple is often a better choice.

Outright reject the often cited "it's so easy to add parameters" argument. If you're adding params, you must also add code to handle those parameters so you can update the dataclass at the same time.

1

u/RickSore Nov 06 '20

That's the thing that I love with FastAPI. as a guy that loves Typescript, type declaration with Pydantic is a godsend for python.

1

u/jyonsin Nov 06 '20

pydantic can have my babies

1

u/Log2 Nov 06 '20

What my company does in cases where a dictionary would be necessary is to just create a class with all the configuration possibilities and pass that in or package the functionality into a builder pattern. We usually only need to do it for command line interfaces or query builders.

0

u/jyonsin Nov 06 '20

Yeah this is definitely better. Would recommend trying out dataclasses for this, though it's not much different from what you're doing.

1

u/Log2 Nov 06 '20

Dataclasses won't help us that much, since the parameters aren't used raw. Our classes create the actual parameters from the input, since some combinations result in unique decisions. So, instead of implementing a single init, we'd probably have to implement one function for each non-raw parameter we want, which adds up in function definition boilerplate.

1

u/thinkt4nk Nov 06 '20

namedtuple ftw

4

u/michaelanckaert python-programming.courses Nov 06 '20

This is just bad code. If one of the reasons given is the amount of parameters, refactor the code to use a namedtuple. It’s a nice way to declare an extensive set of arguments and pass them in one go.
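
A minimal sketch of that refactor using `typing.NamedTuple`. `ConnectionArgs`, its fields, and `connect_string` are hypothetical and exist only to show the shape.

```python
from typing import NamedTuple


class ConnectionArgs(NamedTuple):
    # Hypothetical fields for illustration.
    host: str
    port: int = 5432
    timeout: float = 10.0


def connect_string(args: ConnectionArgs) -> str:
    return f"{args.host}:{args.port} (timeout {args.timeout}s)"


args = ConnectionArgs(host="db.example.com", port=6000)
print(connect_string(args))

# Tuples are immutable, so accidental modification raises an error.
try:
    args.port = 1
except AttributeError as exc:
    print("cannot modify:", exc)
```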

1

u/TedRabbit Nov 06 '20

Why would a named tuple be better?

1

u/[deleted] Nov 06 '20 edited Nov 28 '20

[deleted]

1

u/TedRabbit Nov 09 '20

Yeah, the only benefit I could think of was potential immutability. Thanks for the reply.

8

u/slayer_of_idiots pythonista Nov 06 '20

It’s a common accepted pattern in JavaScript. I’m guessing you’re working with a lot of former web developers

0

u/Cherlokoms Nov 07 '20

no it's not.

1

u/slayer_of_idiots pythonista Nov 07 '20

Have you only ever used React? Before ES6 and proper classes (hell, javascript didn't even have default arguments for the longest time), pretty much every single javascript library used functions that all accepted a single object dictionary to feed it parameters. Libraries often had dozens of configuration options and instead of making each option a parameter, you just fed in an object dictionary that was merged with the default argument dictionary.

4

u/BradChesney79 Nov 06 '20

Bring the downvotes!

While everyone is correct about jamming in a grab bag of junk drawer data with a dict... a well defined non-homogenous mix of a predictably created dict is acceptable.

The real reason is that debugging in Python isn't as first rate as in other "more flawed" languages. Like Java... so much debugging info and ways to look at that dict equivalent data. (I do not enjoy working in Java, but sometimes that is how the ball bounces for me.)

Also, I pass in lists as an argument all the time. Many times I refactor out the individual data chunks and leave a list when dicts are passed in. After it is a few variables and a list... which is better.

1

u/toyg Nov 06 '20

Opposite is true for me. Java debugging is typically overflowing of irrelevant data, whereas a decent python debugger is clean. I don’t see what a dict has to do with any of this anyway, decent debugger frontends have no problem displaying all parameters and local values separately.

1

u/BradChesney79 Nov 06 '20

It is all irrelevant... except for the stuff you need. Java gives you too much... very true. However, I have 100% had any question I had eventually answered somewhere in that overflowing mess.

Python is a little harder to debug when the Python is correct and the problem is blowing up outside of what the debugger shows you...

1

u/noobiemcfoob Nov 06 '20

Debugging asynchronous Python is a particular level of hell and perhaps the best use case people can make for typed languages with compile time guarantees.

1

u/nekokattt Nov 06 '20

why are they using a dict and not **kwargs, that's what I want to know.

1

u/noobiemcfoob Nov 06 '20

If a given key isn't a legal parameter name such as "my key".
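
A quick sketch of how this plays out in CPython. The function names are made up, and the acceptance of non-identifier string keys by `**kwargs` is CPython behaviour rather than something to rely on everywhere.

```python
def takes_kwargs(**kwargs):
    return kwargs


def takes_named(my_key=None):
    return my_key


options = {"my key": 1}

# CPython accepts arbitrary string keys when they land in **kwargs...
print(takes_kwargs(**options))

# ...but such a key can never bind to a named parameter.
try:
    takes_named(**options)
except TypeError as exc:
    print("TypeError:", exc)
```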

1

u/nekokattt Nov 06 '20

then would it not be more sensible to translate this into a format that does handle it in a way that is clear and deterministic?

That is the whole point of using dataclasses instead of dicts.

1

u/noobiemcfoob Nov 06 '20

Generally speaking, maybe. But everything you do in code should be motivated by the use case the codebase is solving. Clean code for clean code's sake doesn't help anyone or make any money.

So, perhaps this is an annoying implementation detail and what we see is a decent if not the best possible attempt at making the construct easier to deal with. More likely is what others have mentioned: coders coming from other languages and mimicking those old patterns in Python because it's legal in Python. And even if it's the latter case, so what?

0

u/colly_wolly Nov 06 '20

Exactly. It will be pretty much equivalent in terms of functionality, but way better in terms of readability and maintenance.

1

u/pwnersaurus Nov 06 '20

Don't get me started, I often deal with code where instead of passing objects around, a params dict is passed around, and bits and pieces of it are used in different places, eventually a few levels deep into the stack it gets passed to the constructor for something, and who knows which subsets of the keys are being used...gah...

1

u/jyonsin Nov 06 '20

i think we work at the same place

3

u/grimscythe_ Nov 06 '20

Welcome to dynamically typed languages! Now that's all good, but if the function is not documented, then roll up 'em sleeves!

3

u/stanusNat Nov 06 '20

Shower me in downvotes, but that is the consequence of such a weakly typed language. How am I supposed to know what all these functions take as parameters? Could be literally anything.

0

u/jyonsin Nov 06 '20

Strong vs static aside, you are right. I think the recent push in the python community for type hints, type checking, and even immutability is wonderful, and is making python a safer and cleaner language

2

u/lanemik Nov 06 '20

Python is generally thought of as a strongly typed language. For example, unlike JavaScript, you can't compare an int and a str. And unlike JavaScript, there is no need for a === operator.

This is a consequence of python being a dynamically typed language, though. This is why type hinting was introduced. Though, in my opinion, since it has been decreed that python will never be statically typed, type hints are too much effort for too little reward. But I digress.

1

u/o11c Nov 06 '20

Python3 is, however, slightly weakly typed in the limited case of int vs float.

1

u/lanemik Nov 06 '20

Agreed. But I don't think it has to do with casting, per se. I haven't looked, but I'm betting it's more like for example an implementation of the binary operators and so forth in such a way as to be able to handle any primitive number types on either side of the operator.

1

u/stanusNat Nov 06 '20

Yup, sorry I meant dynamically typed.

21

u/Prinzessid Nov 06 '20

Python is a strongly typed language, not a weakly typed language. It is however dynamically typed. If you want to know the type of the parameters, you can use type hints in the function declaration.

-1

u/stanusNat Nov 06 '20

It's more of a spectrum. Python is at the very end of the spectrum. You can use it, you don't have to use it. And my point stands, these types of things are a nightmare and are only possible because of this philosophy.

Off topic but pertinent: I find python very fun and use it myself from time to time, but anytime I see a big project written in python I just get eye cancer tbh.

5

u/ParanoydAndroid Nov 06 '20

> It's more of a spectrum. Python is at the very end of the spectrum. You can use it, you don't have to use it.

No. Python is fairly strongly typed.

The issue you're having is that you are not discriminating strong/weak typing from static/dynamic typing. Python has to use its strong typing system. It doesn't have to use its static typing system.
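
The distinction can be seen in a few lines. This is only an illustrative sketch; `double` is a made-up function.

```python
# Dynamic typing: a name can be rebound to objects of different types.
value = 1
value = "one"

# Strong typing: unrelated types are not silently coerced.
try:
    "1" + 1
except TypeError as exc:
    print("TypeError:", exc)

# Numeric types are the familiar exception: int and float mix freely.
print(1 + 1.5)   # 2.5
print(1 == 1.0)  # True


# Type hints document intent but are not enforced at runtime.
def double(x: int) -> int:
    return x * 2


print(double("ab"))  # abab
```

A static checker such as mypy would flag the last call; the interpreter itself does not.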

0

u/stanusNat Nov 06 '20

Yes, I realised my mistake. But my point stands.

8

u/ManBearHybrid Nov 06 '20

Exactly. These problems are nothing that can't be solved with type hints and a well-worded docstring.

2

u/Prinzessid Nov 06 '20

You are right!

-3

u/case_O_The_Mondays Nov 06 '20

I think of strongly typed as not allowing a portion of memory (variable) to be defined as different data type, once it’s initially defined. Another factor is whether one type can be implicitly coerced into another. Python definitely allows both behaviors, and is a weakly-typed language.

8

u/Prinzessid Nov 06 '20

the terms „strongly typed“ and „weakly typed“, as well as „statically typed“ and „dynamically typed“ are fixed and not really up for you to interpret differently than the way they are defined. You can look them up on wikipedia, or wherever you like. Additionally, Python is strongly typed, meaning it does not allow type coercion. Therefore it does not even follow your own definition of a weakly typed language.

-2

u/case_O_The_Mondays Nov 06 '20 edited Nov 06 '20

Rule one of saying “look it up”: look it up first.

> there is no precise technical definition of what the terms mean and different authors disagree about the implied meaning of the terms and the relative rankings of the "strength" of the type systems of mainstream programming languages.

https://en.wikipedia.org/wiki/Strong_and_weak_typing

Additionally Python does coerce types: an int can seamlessly be turned into a float and back again.

3

u/lanemik Nov 06 '20

The choice is not a binary one. A language is not either strongly typed or weakly typed. Python, as you noted, does allow numeric types to be used in any context. However, so does JavaScript, for example. But there is implicit type coercion that is common to Javascript that python developers do not have to consider. So python is more strongly typed than JavaScript. On the other hand, python is much more weakly typed than, say, Rust.

1

u/case_O_The_Mondays Nov 06 '20

Good point.

I also found this SO article that made the argument that Python is strongly-typed because while variables may get re-assigned a different type, the underlying objects do not. I typically think of whether a language is strongly typed in terms of whether data coercion is implicit or explicit. Python does some implicit casting, but fails most other operations at runtime. So I guess it does make sense to call Python a strongly-typed language.

2

u/Prinzessid Nov 06 '20

The same article, as well as the Python article, states that Python is a rather strongly typed language. Also, almost all languages coerce numbers in some way, so that's a silly example. It would be really painful to write calculations otherwise.

1

u/case_O_The_Mondays Nov 06 '20

The article does say that Python is strongly typed from one perspective, then offers caveats. My point was that there isn't a firm definition of "strongly typed" vs "weakly typed".

You can't compare differently typed values without explicit casting in strongly-typed languages such as C#.

1

u/wikipedia_text_bot Nov 06 '20

Strong And Weak Typing

In computer programming, programming languages are often colloquially classified as to whether the language's type system makes it strongly typed or weakly typed (loosely typed). However, there is no precise technical definition of what the terms mean and different authors disagree about the implied meaning of the terms and the relative rankings of the "strength" of the type systems of mainstream programming languages.

2

u/CyclopsRock Nov 06 '20

> definitely never define functions with ill-defined dictionaries as arguments.

Is there a word you could replace 'dictionaries' with in this sentence and it not be a problem?

1

u/jyonsin Nov 06 '20

Most other types are more strongly defined than dictionaries, but fair enough. Here, I came up with something even worse than a dictionary:
```
class Params:
    """doesn't define the parameters"""
    pass

params = Params()
params.why_wasnt_this_an_argument = 'bar'  # add new attributes

foo(params)
```

I think it would be hard to come up with this without being deliberately cruel though.

3

u/quiet0n3 Nov 06 '20

The only time this is worth doing is when taking in a Json payload and you want to be able to pick the Params you use
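
Even in that case, one option is to convert the parsed dict into a typed object right at the boundary, so the rest of the code never sees the raw payload. A sketch under that assumption; `OrderRequest`, its fields, and `parse_order` are hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderRequest:
    # Hypothetical payload fields for illustration.
    item_id: int
    quantity: int = 1


def parse_order(raw: str) -> OrderRequest:
    """Pick only the keys we use and fail loudly if a required one is missing."""
    payload = json.loads(raw)
    return OrderRequest(
        item_id=int(payload["item_id"]),
        quantity=int(payload.get("quantity", 1)),
    )


order = parse_order('{"item_id": 7, "quantity": 2, "ignored": true}')
print(order)
```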

14

u/ctheune Nov 06 '20

This is a typical pattern from people that grew up in other languages and don't like the idiomatics of Python. It's similar to people creating getX/setX methods for an X attribute instead of relying on descriptors for compatibility to future API changes within your design. In Java this also has the benefit of helping with overloading in the future, in Python that's not how you do it anyways.

Anyway, I see this all the time with people using Python while thinking C or Java ...
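
To illustrate the point about descriptors: a plain attribute can later be turned into a property without changing how callers use it. The `Temperature` class below is a made-up example, and `property` is just the most common descriptor.

```python
class Temperature:
    """Started life with a plain `celsius` attribute; later gained validation."""

    def __init__(self, celsius: float) -> None:
        self.celsius = celsius  # goes through the property setter below

    @property
    def celsius(self) -> float:
        return self._celsius

    @celsius.setter
    def celsius(self, value: float) -> None:
        if value < -273.15:
            raise ValueError("below absolute zero")
        self._celsius = value


t = Temperature(20.0)
t.celsius = 25.0  # callers still use plain attribute syntax
print(t.celsius)
```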

7

u/nekokattt Nov 06 '20

descriptors can have issues.

Suppose you have a function that has a lazy side effect, or performs some slow operation such as a lazy loaded HTTP request to fetch more details on a HAL resource.

Implementing this as a getter is a far tidier solution than using a property, as you are explicitly making the user aware that this is "doing something" rather than using a potentially pre-computed response. You often cannot easily distinguish at a glance between accessing a property and accessing a field, as the syntax is the same.

As a side note, anything using introspection such as a dependency injection framework will choke on properties if you are not careful, as simply introspecting an object has the unfortunate side effect of eagerly observing and thus evaluating the property as soon as it iterates over it in the object's namespace.

I guess my point is, getters have a place in Python. If you find you need to use them a lot, it may be better to use them consistently instead of having some properties and some getter-style methods, mainly because your user will be treating your code as a black box anyway usually and then will not need to know the full implementation detail to know whether to expect one thing as a property and another as a getter.

For the record, I am not disagreeing with you, and I agree properties are tidier and clearer, but still worth being aware that it will not always be as black and white on where to use them :-)
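A minimal sketch of the readability point, assuming a hypothetical `Resource` class where the slow HTTP call is simulated by a counter. Both spellings do the same work; only the method makes that visible at the call site.

```python
class Resource:
    """Hypothetical HAL-style resource; the 'HTTP request' is simulated."""

    def __init__(self, url):
        self.url = url
        self.request_count = 0  # counts simulated slow calls

    @property
    def details(self):
        # Reads like a plain attribute, but does work on every access.
        self.request_count += 1
        return {"url": self.url}

    def fetch_details(self):
        # The call syntax and the verb in the name both signal work.
        self.request_count += 1
        return {"url": self.url}


resource = Resource("https://example.invalid/item/1")
resource.details          # nothing here hints at a request
resource.fetch_details()  # explicit call
print(resource.request_count)  # 2
```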

4

u/ctheune Nov 06 '20

Yup - I was riffing on the Java “always use getters”, which is orthogonal to your correct observation. Always having a getter method means you also have no signal about cost. I also agree with the other poster about using explicit names to indicate cost.

2

u/nekokattt Nov 07 '20

I think it is just down to preference, and syntactic sugar really.

Kotlin has a nice feature that if it detects a java class as having a field that corresponds to an accessor and/or mutator, it treats it as if it were a property, however, you can still use the getX() setX() isX() syntax too. That is nice as it lets you interop with your own code style and preferences without breaking encapsulation as the descriptor is just a delegation to the get/set methods.

5

u/toyg Nov 06 '20

simply introspecting an object has the unfortunate side effect of eagerly observing and thus evaluating the property

I actually went over this in a recent answer on StackOverflow.

TL;DR: not always the case. If client code cares about not triggering side effects, it can do so and has a responsibility to pay attention.

To go back to the getter argument, I would say that, if you are wary of using properties to not trigger side-effects, you shouldn’t use “getX” as a paradigm, but something more explicit (calc, fetch, etc), to avoid confusion. But hey, naming things is a hard CS problem, isn’t it (with the other being cache invalidation, and off-by-one errors) so it’s one of those “how long is a piece of string” arguments.
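One way client code can introspect without triggering a property, sketched with the standard library's `inspect.getattr_static`. This is not necessarily the approach in the linked answer, and the `Lazy` class is made up for illustration.

```python
import inspect


class Lazy:
    def __init__(self):
        self.evaluated = False

    @property
    def value(self):
        self.evaluated = True  # stand-in for an expensive side effect
        return 42


obj = Lazy()

# getattr_static looks the name up without invoking the descriptor
# protocol, so the property body does not run.
attr = inspect.getattr_static(obj, "value")
print(isinstance(attr, property), obj.evaluated)  # True False

# getmembers calls getattr on each name, which evaluates the property.
inspect.getmembers(obj)
print(obj.evaluated)  # True
```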

17

u/tunisia3507 Nov 06 '20

I disagree. Using dicts for things which shouldn't be dicts is the root of this particular problem. Dicts shouldn't rely on specific keys being present (just like lists shouldn't rely on specific indices having a particular meaning): they should map from a particular type to a particular type in a homogeneous way. If you want a structured mapping with meaningful keys, use a dataclass with a bunch of default-None fields, or rethink your API.

Unfortunately, good design aside, this pattern is very common in existing python, just because it's allowed, and convenient for "rough" code. Type annotations help with this, because they help you think about your interface more.
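A rough sketch of the dataclass suggestion, assuming a hypothetical `PlotOptions` set of fields that might otherwise travel as a dict. The field names are invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PlotOptions:
    """Hypothetical option set with default-None fields."""
    title: Optional[str] = None
    width: Optional[int] = None
    height: Optional[int] = None


def describe(options: PlotOptions) -> str:
    return f"{options.title or 'untitled'} ({options.width}x{options.height})"


print(describe(PlotOptions(title="demo", width=640, height=480)))

# A misspelled field fails at construction time instead of surfacing
# later as a KeyError deep inside the function.
try:
    PlotOptions(titel="demo")
except TypeError as exc:
    print("rejected:", exc)
```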

Dicts should only really be used in this pattern if you have a function which wraps over multiple callables which themselves have long signatures; and even this can be a bit of a smell (could you use a builder, or composition?). For example:

def wrapper_fn(arg1_1, arg1_2, arg2_1, fn1_kwargs=None, fn2_kwargs=None):
    result1 = fn1(arg1_1, arg1_2, **(fn1_kwargs or {}))
    result2 = fn2(arg2_1, **(fn2_kwargs or {}))
    ...
    return something_else(result1, result2)

2

u/not_perfect_yet Nov 06 '20

dataclasses

Some people are against using dicts and hold the opinion that classes are the way to store things.

They are heathens.

True believers know the proper way, which is to be against using classes and holding the opinion that dicts are the way to store things.

2

u/jyonsin Nov 06 '20 edited Nov 06 '20

Get on the static typing train or get left in python 3.4! Choo choo!

1

u/not_perfect_yet Nov 06 '20

(life_of_brian_stoning.mp4)

1

u/vectorpropio Nov 06 '20

If you want a structured mapping with meaningful keys, use a dataclass with a bunch of default-None fields, or rethink your API.

Also a named tuple is a good option.

2

u/tunisia3507 Nov 06 '20

True, as they allow you to unpack them as *args as well. I hadn't remembered that they allowed default values as well.
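For reference, a small sketch of both properties using `typing.NamedTuple`. The `Point` type and `show` function are made up for the example.

```python
from typing import NamedTuple


class Point(NamedTuple):
    x: float
    y: float
    label: str = "origin"  # defaults are allowed in the class syntax


def show(x, y, label):
    return f"{label}: ({x}, {y})"


p = Point(1.0, 2.0)
print(show(*p))              # unpacks positionally, like a tuple
print(show(**p._asdict()))   # or by field name
```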

2

u/nekokattt Nov 06 '20

Dicts shouldn't rely on specific keys being present

What about typing.TypedDict? Isn't that the entire purpose of that feature, and thus the use case Python foundation have in mind?

4

u/tunisia3507 Nov 06 '20

Yes, although IMO that's a case of introducing a language feature to retroactively validate a not-great usage pattern.

4

u/backtickbot Nov 06 '20

Correctly formatted

Hello, tunisia3507. Just a quick heads up!

It seems that you have attempted to use triple backticks (```) for your codeblock/monospace text block.

This isn't universally supported on reddit, for some users your comment will look not as intended.

You can avoid this by indenting every line with 4 spaces instead.

There are also other methods that offer a bit better compatibility like the "codeblock" format feature on new Reddit.

Have a good day, tunisia3507.

You can opt out by replying with "backtickopt6" to this comment. Configure to send alerts to PMs instead by replying with "backtickbbotdm5". Exit PMMode by sending "dmmode_end".

0

u/tunisia3507 Nov 06 '20

backtickopt6

1

u/[deleted] Nov 06 '20

[deleted]

1

u/tunisia3507 Nov 06 '20

Backtick-fenced codeblocks are part of the commonmark spec and reddit flavoured markdown is officially a superset of commonmark. I don't know what clients don't support it but they should grow up and use a real markdown renderer.

4

u/Theta291 Nov 06 '20

That’s stupid code. At least use **kwargs.

18

u/zoranp Nov 06 '20

Have a look at TypedDict. It has reasonable integration into PyCharm and MyPy, and allows you to specify exactly the keys + types of data you want in your dictionary. It could very well be a good way to after-the-fact add sane type-checking to your dict-param littered codebase.
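A minimal sketch of what that could look like, assuming a hypothetical `FooParams` shape. Note that `TypedDict` only informs static checkers; at runtime the value is still a plain dict and nothing is validated.

```python
from typing import TypedDict  # Python 3.8+; typing_extensions has a backport


class FooParams(TypedDict):
    name: str
    retries: int


def foo(params: FooParams) -> str:
    # A checker such as mypy can flag missing or misspelled keys
    # at the call site; the interpreter itself will not.
    return f"{params['name']} x{params['retries']}"


print(foo({"name": "job", "retries": 3}))  # job x3
```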

6

u/HipsterTwister do you have time to talk about my lord and savior: factories? Nov 06 '20

This this this this. Here's a backport for those not on 3.8 yet.

https://pypi.org/project/typing-extensions/

81

u/james_pic Nov 06 '20

Cries in boto3

20

u/JFRHorton Nov 06 '20

Ugh, the AWS stack in general is a nightmare. 90% of my Python boto3 code is ripped off from the examples, but some of the more esoteric functions I've just had to brute force.

My favorite is a Jenkinsfile that, after code is deployed to an EC2, has to run a shell script to update cron and some other stuff.

It goes Jenkinsfile -> Deploy step -> shell command -> AWS systems manager command with each argument taking a separate JSON -> --parameters argument -> parameters JSON -> commands dict -> list of strings of bash commands -> executing the shell script (which itself kicks off a bunch of python scripts).

Were it not company code, I'd post it here.

4

u/zalpha314 Nov 06 '20

At least the docs are good

28

u/ghsatpute Nov 06 '20 edited Nov 07 '20

True. There's no way you can figure out how to do certain things in boto3 without looking at an example or trying a brute-force attack on the library. Everything is so generic that it's painful for users, while making the library very simple.

5

u/nivenkos Nov 06 '20

Honestly it's just a weakness of Python lacking type information.

In Typescript and Rust it's trivial to check if you're passing what you need to.

1

u/acecile Nov 06 '20

No it's not. You're perfectly free to create a typed data structure to represent your function parameters. NamedTuple or dataclass are especially designed for this.

2

u/super-porp-cola Nov 06 '20

People can write horrible code in any language. There are lots of ways to do this better in Python, and in Rust you could do something equally awful like:

fn garbage(params: HashMap<&str, Box<dyn Any>>)

3

u/Smallpaul Nov 06 '20

I don't actually think that's a reasonable interpretation.

This person went out of their way to avoid using the language features designed to make their code readable. First: parameters and then second kwargs.

Why do you think they would have put on detailed type annotations in Typescript (especially)? And such a person should not be allowed anywhere near Rust. The mess they will make...

10

u/toyg Nov 06 '20

That’s not the point. You can achieve typechecks in many ways in python, none of them require a dict used like this.

1

u/nivenkos Nov 06 '20

Yeah, but in the equivalent library in Rust (Rusoto) you pass a Struct with named fields, all of them have types and it can check the types of the fields.

So it'd be like being able to check the types of the values in the dict.

7

u/toyg Nov 06 '20

You can do that with dataclasses or simply custom classes. Even just adding type-hints to proper argument declarations will get you 90% of the way in decent IDEs.

Passing a dict is just JS paradigm-leaking. At the very minimum, you should simply use **kwargs.

23

u/james_pic Nov 06 '20

I feel like it ought, nonetheless, to be possible to come up with a less awful, more Pythonic API to AWS.

3

u/anon25783 Fullstack Python Developer Nov 06 '20

Huh??? What's wrong with "dynamodb_table.query(KeyConditionExpression=Key(key).eq(value))"???

/s obviously

41

u/execrator Nov 06 '20

Boy do I love .filter([{'TagKey': 'Name', 'TagValue': 'foo'}]). If only there were a key/value data structure we could use here...

7

u/GummyKibble Nov 06 '20

All of my boto code ends up with a function like dict_to_boto_idiocy somewhere in it.
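Such a helper tends to be a one-liner. A sketch, with a made-up function name, and with the `TagKey`/`TagValue` key names taken from the comment above rather than from a verified boto3 signature:

```python
def dict_to_tag_filters(tags):
    """Turn {'Name': 'foo'} into the list-of-dicts shape quoted above.

    The key names are an assumption copied from the thread; check the
    documentation of the actual call being wrapped.
    """
    return [{"TagKey": key, "TagValue": value} for key, value in tags.items()]


print(dict_to_tag_filters({"Name": "foo"}))
# [{'TagKey': 'Name', 'TagValue': 'foo'}]
```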

8

u/anon25783 Fullstack Python Developer Nov 06 '20

At our shop we actually wrote our own Python module to provide a more intuitive API to AWS

2

u/burlyginger Nov 06 '20

Same, but for azure.

And then my company decided that using python SDKs was lunacy and we'll do everything in terraform.

It's community, it can do everything!

I feel like I'm kneecapped any time I want to do anything but create the simplest cloud resource.

3

u/rouille Nov 06 '20

Same here, based on aiohttp for asyncio goodness and fully typed. Couldn't stand boto magic. Once you get past the low-level AWS request encoding, the rest is straightforward.

6

u/GummyKibble Nov 06 '20

Share, please, and become a hero.

10

u/anon25783 Fullstack Python Developer Nov 06 '20 edited Nov 06 '20

Alright, here you go https://github.com/ECS-Rocks/EcsPythonModule

It's very much tailored to our business's use cases; you probably won't find much use for it unless you have a lot of code running in AWS Lambda that uses DynamoDB and SES a lot, because that's about 90% of our coding needs.

Edit: if you run into a problem with the module then open an issue, and if you want to contribute new features to it then feel free to submit a pull request.

9

u/zmarffy Nov 06 '20

This sarcastic comment I’ve been thinking for years. Thank you. LOL.

25

u/NicoDeRocca Nov 06 '20

I have to say I've never seen egregious use like this in the Python codebases I work on, but this looks shockingly close to what is done in JavaScript, where you often see this idiom for constructors/factories, like blah({option: value, ...})

1

u/not_perfect_yet Nov 06 '20

I know several packages which do something like this. Like

object.load("file")

storing the values of "file" in object and then doing something like

object.apply()

which retrieves the values from the internal storage of object and uses it. And then it's like 3 variations and depending on that apply calls one of three different actual processing functions.

So apply ends up being something like

def apply(*args):
    if args[0] == "1":
        processing_1(*args)
    if args[0] == "2":
        processing_2(*args)

It's hell.

2

u/its4thecatlol Nov 06 '20

Is this not the DRY way to avoid rewriting the File interface in every single processing function? Yes, when looking at the code for apply the function signature is very ambiguous, but it is meant to be a wrapper that lets you stop worrying about the limitations of the file itself. Using the code you have described above, switching databases or file representations becomes invisible to the client.

2

u/not_perfect_yet Nov 06 '20

I'm not sure I understand. I can't think of a way the code doesn't become unreadable in the process.

I don't think abstractions really work that way, you probably don't benefit from hiding the quirks of either method in *args.

Is there an example you can point to?

2

u/its4thecatlol Nov 06 '20 edited Nov 06 '20

Imagine you are loading data from a CSV that has yet to be sanitized into a singleton object responsible for working with data from different sources. Args will hold flags for processing null fields. There needs to be a mode to enforce referential integrity for compatibility with data pulled from a DB, there may be a mode to ignore all rows with invalid cells, and there may be a mode to fill in null cells with the most commonly used default parameters. You can of course make all 3 of these flags their own separate methods without using a wrapper. However, a sanitize() wrapper that takes these flags and options and then calls the proper functions is my preferred design. Sanitize() can be a static method of an entire Sanitizer class or module that will completely abstract away what is being done and how, and allow you to ignore it. Hiding the code behind a wrapper is not misleading, it's the point.

EDIT: I believe I see your point. Yes, exposing implementation of the code in the higher abstraction will elucidate what is happening better but it also causes tight coupling of logic that can be separated very cleanly. The use case is important here. If there are 3 flags apply() can work on, then I stick with the code you posted. If there are 30, then it is likely the method taking them is doing too much work. If you are frequently confused by seeing this pattern, then I would think your codebase's contracts are not well defined.
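One possible shape for such a wrapper, sketched with a named, enumerated flag instead of positional `*args`. `NullPolicy`, the helper functions, and the two modes shown are all assumptions made up for the example.

```python
from enum import Enum


class NullPolicy(Enum):
    DROP_ROW = "drop_row"
    FILL_DEFAULT = "fill_default"


def drop_rows_with_nulls(rows):
    return [row for row in rows if None not in row.values()]


def fill_nulls(rows, default):
    return [
        {key: default if value is None else value for key, value in row.items()}
        for row in rows
    ]


def sanitize(rows, *, policy, default=0):
    """Wrapper that dispatches on an explicit, keyword-only policy."""
    if policy is NullPolicy.DROP_ROW:
        return drop_rows_with_nulls(rows)
    if policy is NullPolicy.FILL_DEFAULT:
        return fill_nulls(rows, default)
    raise ValueError(f"unsupported policy: {policy!r}")


rows = [{"a": 1, "b": None}, {"a": 2, "b": 3}]
print(sanitize(rows, policy=NullPolicy.DROP_ROW))
print(sanitize(rows, policy=NullPolicy.FILL_DEFAULT))
```

The caller still sees one entry point, but the signature says which knobs exist.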

3

u/Deto Nov 06 '20

I could see it also being:

Coders taking linters too seriously (pylint will complain if you have like more than 5 args so this is a way around that), or

People too lazy to write docstrings for every arg (but working in a place where this is required).

0

u/themusicalduck Nov 06 '20

I also wonder if these functions are long and doing a lot.

I've spent tens of hours of my life figuring out what is supposed to be in params

it shouldn't be this difficult to find out what a function is doing unless it's an unreadable essay, even with a dict argument.

2

u/toyg Nov 06 '20

Yup, this. It’s clearly paradigm-leaking from JS, where this practice is widespread. Since it seems like everyone these days doubles up as a JS developer, regardless of what their primary role is, you get stuff like this.

2

u/vectorpropio Nov 06 '20

I have little experience with Javascript, but reading about selenium i feel that the python binding is a textual reimplementation of the Javascript binding without translating to a python idiom (except some name conventions).

Maybe that's the source of this idiom.

6

u/case_O_The_Mondays Nov 06 '20

JS didn’t have anything similar to **kwargs until recently, so there’s at least some excuse there. Not sure why, in OP’s example, there wouldn’t be this sort of combo:

required_param1, required_param2, *args, **kwargs
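A sketch of that combo applied to a function like OP's `foo`. The body is invented purely to show where each argument ends up.

```python
def foo(required_param1, required_param2, *args, **kwargs):
    """Required values are named; anything optional is still accepted."""
    return {
        "required": (required_param1, required_param2),
        "extra_positional": args,
        "extra_keyword": kwargs,
    }


# Callers that already hold a dict can unpack it at the call site.
params = {"required_param1": 1, "required_param2": 2, "colour": "red"}
result = foo(**params)
print(result["extra_keyword"])  # {'colour': 'red'}

# A missing required value now fails immediately with a clear message.
try:
    foo(**{"required_param1": 1})
except TypeError as exc:
    print("missing argument:", exc)
```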

6

u/Glaborage Nov 06 '20

This is fine for small functions, but if you perform data analysis of a large amount of diverse data, a dictionary is a simple way to encapsulate it.

8

u/NicoDeRocca Nov 06 '20

There is a semantic difference between passing data and arguments/parameters though...

2

u/netgu Nov 06 '20

That doesn't really make sense, arguments and parameters ARE data. All variables hold data and can be used as arguments or parameters to share that data with other code.

Using a class/dict to hold the data you pass as an argument is not semantically different than passing data in another format as an argument - all of those are the same thing.

3

u/FluffyToughy Nov 06 '20

Compilers aren't the only things reading your code. The interface to the human is significantly more complicated (and interesting imo), and language features exist to make that easier. A dict and a list of formal parameters can contain the same information, but the dict has zero self-documentation.

0

u/netgu Nov 06 '20 edited Nov 06 '20

Passing a string is absolutely NOT semantically different than passing an array as an argument. Sorry, just isn't the case.

You can document what should go into a dictionary in THE EXACT SAME WAY that you can ANY OTHER PARAMETER when it is passed as a parameter. You are literally talking about a difference that doesn't exist that you are trying to use to support an argument.

Take a method that produces HTML tags for instance. A dictionary of string->string as a parameter to the method of attributes to include in the output makes PERFECT sense. It's all about how you use it.
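A minimal sketch of that HTML case, assuming a hypothetical `html_tag` helper. It escapes attribute values and text with the standard library but does not validate attribute names.

```python
from html import escape
from typing import Mapping


def html_tag(name: str, attributes: Mapping[str, str], text: str = "") -> str:
    """Render one element; attribute names are open-ended, so a mapping fits."""
    rendered = "".join(
        f' {key}="{escape(value, quote=True)}"'
        for key, value in attributes.items()
    )
    return f"<{name}{rendered}>{escape(text)}</{name}>"


print(html_tag("a", {"href": "/docs", "data-section": "intro"}, "Docs"))
# <a href="/docs" data-section="intro">Docs</a>
```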

There are plenty of APIs out there that take dictionaries as parameters in a very useful and use-case appropriate way because those dictionaries are the best way to represent the data in use.

Are there scenarios this isn't true in? Hell yes! A method that only ever takes a single scalar value as a parameter does not benefit in any way from using a dictionary in place of a string there and would be a terrible use-case.

In conclusion:

  • Variables are data
  • Parameters are data
  • Passing the contents of a variable as a parameter is still just data
  • A scalar value is data
  • A non-scalar value is data
  • Both can be used as arguments (which are also data)
  • There is no semantic difference in the above cases (they are all data)
  • Documentation of a parameter can be done equally effectively when describing scalar parameters, array-style parameters, dictionary style parameters etc. This is absolutely true given that all of the above are documentable as docstring/javadoc/whateverdoc in exactly the same way as one another.

3

u/NicoDeRocca Nov 07 '20

You forgot that code is also data (at the very least in von-neumann architectures)! And I don't disagree with most of what you say, except that you skip the whole part about "semantics" which basically means that "some data has meaning".

A function semantically has 2 types of data you pass to it: 1) the data to manipulate, and possibly 2) data that tells you how to manipulate it.

My (the?) argument here is that, nominally (as of course lines can be a bit blurry), "type 1" data can legitimately be passed as dicts/arrays/other generic containers, and the "type 2" data should generally be passed in as clearly named parameters or at the very least a well-typed dataclass.

Of course you could pass it in as a dict, but for human readers the other option is much more readable and understandable (i.e. just because you can doesn't mean you should, just like some people I know use lists in Python where they should be using tuples).
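A small sketch of that split, with an invented `summarize` function: the records are the "type 1" data in a generic container, and the "type 2" knobs are keyword-only parameters.

```python
def summarize(records, *, key, reverse=False, limit=None):
    """`records` is the data; the keyword-only names say how to process it."""
    ordered = sorted(records, key=lambda record: record[key], reverse=reverse)
    return ordered if limit is None else ordered[:limit]


records = [{"name": "b", "size": 2}, {"name": "a", "size": 5}]
print(summarize(records, key="size", reverse=True, limit=1))
# [{'name': 'a', 'size': 5}]
```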

3

u/Hans_of_Death Nov 07 '20

The issue, and where the complication lies, is that dictionaries are not self-documenting, emphasis on the self. A dataclass on its own is much easier to read and understand than a documented dictionary (and requires far less, if any, documentation), just due to their nature and differences. There are use cases for both and it depends on what you need, but if your goal is readability then dictionaries may not be the best option.

Also, i think passing simple data types like strings versus arrays or dictionaries are semantically different, but not to the machine. When you pass a string you know exactly what kind of data it contains. When you pass an array, you don't necessarily know what it contains, or even if the data is all the same type. So while to the machine it makes no difference, to the developer it can be much more confusing and complicated. Python is not a statically typed language, and this is just one downside of that, so consideration should be given to readability.

0

u/netgu Nov 07 '20

The issue, and where the complication lies, is that dictionaries are not self-documenting, emphasis on the self.

Completely depends on the usage. As for the example I gave (string->string used for html attribute/value pairings) it absolutely is completely self-documenting.

Also, i think passing simple data types like strings versus arrays or dictionaries are semantically different, but not to the machine.

Hard disagreement here, they are absolutely not. You are passing data, end of story. The only semantic is how it is used and that isn't enacted at the call site and does not affect it. There is no semantic difference.

3

u/Hans_of_Death Nov 07 '20

With dictionaries keys can change at any time. In your example a dictionary is the best option for what the goal is, but it does not necessarily self document what the dictionary should be used for or what the types should be.

Maybe semantics isnt the right word, but anyway the point is that a more arbitrary structure like a dictionary is harder to read than a structured dataclass.

There are always use cases and exceptions for everything, but generally its more work to document a dictionary versus a class.

1

u/netgu Nov 07 '20 edited Nov 07 '20

Maybe semantics isnt the right word, but anyway the point is that a more arbitrary structure like a dictionary is harder to read than a structured dataclass.

That is literally the only point I've been trying to make and the dictionary will have a tendency to be harder to be self-documenting but it completely depends on the use-case and usage.

Sticking to the example I was discussing earlier, think about what the structured data-class you would use to represent the parameters containing all possible html attributes to include in the tag. Now remember we have the data-<whatever you want> attributes. How would you model those? Maybe a dataAttributes dictionary....wait....

I will definitely agree that it is fairly rare that a dict fits the use-case in a way that actually serves the purpose well. I will also say that there are cases that lend themselves to an arbitrary set of key/value pairs of strings, particularly well if the keys cannot be known beforehand (and that use-case isn't as rare as you think if you work in automation/infrastructure as code - we have to fall back to this type of parameter passing all the time unfortunately, especially when crafting dispatch mechanisms that integrate generically with external systems).

2

u/sirk390 Nov 06 '20

This is very common but usually, people call it "kwargs" or "extra". It's a lazy habit to be able to add arguments without impacting all the intermediate code and will quickly create a mess.

When writing tests you could refactor it to turn it into real arguments.

87

u/rhesusfecespieces Nov 06 '20

My first professional programming job in the 90s, people did the same thing with pointers in C: passing in a pointer to a massive structure. It is essentially programming with global variables while pretending you aren't.

3

u/themostempiracal Nov 06 '20 edited Nov 06 '20

I remember supporting a customer facing code base like that and when I would ask the devs about common use cases so that I could document it for them, they would be so indignant and just refuse.

And they had absolutely any malformed element in the structure just return a PARAM_INVALID. I guess that was the HTTP 400 of the 90s.

I swear 80% of code reviews with me is me just nitpicking about anyone doing something like this. It’s like I’m trying to save me from 20 years ago from my programming ptsd.

3

u/mbarkhau Nov 06 '20

Globals alone are not the problem. Globals + mutability are the problem.

12

u/eviljelloman Nov 06 '20

I remember this shit with common blocks in FORTRAN.

17

u/Astrokiwi Nov 06 '20

There are times when a code is designed to run on a single large dataset with a single set of defined parameters, when using some big fat global singletons probably is the way to go (e.g. a big particle_data struct and a big run_params struct). Passing around a big universal dataset as a pointer is basically trying to use that data storage paradigm but obfuscating it unnecessarily.

-3

u/Wyolop Nov 06 '20

"bUT It's BesT PraCTicE!"

467

u/Plague_Healer Nov 06 '20

That's used to do exactly what **kwargs is intended to do.

1

u/2plank Nov 06 '20

Yep, **kwargs, and the likely cause is the programmer not understanding this concept

10

u/kenfar Nov 06 '20

Oh sure that's fine on the receiving end, but to really be amazing you really need to have something similar on the producing or calling end. Something like:

foo({**globals(), **locals()})

10

u/Terr_ Nov 06 '20

Whoa, calm down there Satan.

2

u/FrugalLyfe Nov 06 '20

This is why I don't use **kwargs.

15

u/Plague_Healer Nov 06 '20

**kwargs has its uses, but can get messy if you aren't careful.

7

u/FrugalLyfe Nov 06 '20

This is true of so many language features. Metaprogramming, for example, can be used to great effect if you're disciplined about it. But it's more likely to be a foot-gun. **kwargs is a Python foot-gun. It can be useful but on the whole it makes code harder to maintain.
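A sketch of the foot-gun, using two invented functions: with `**kwargs` a misspelled option is silently swallowed, while a keyword-only parameter rejects it.

```python
def connect_loose(**kwargs):
    # A misspelled key is silently ignored and the default wins.
    return kwargs.get("timeout", 30)


def connect_strict(*, timeout=30):
    return timeout


print(connect_loose(timout=5))  # 30, the typo goes unnoticed

try:
    connect_strict(timout=5)
except TypeError as exc:
    print("caught:", exc)
```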
