Let's say I have a simple problem: return the index of occurrence of every character in a string. I know you could just run one for loop and print the indices, but let's say I have to return them in some data structure!
Other assumptions: we know for a fact that this is an ASCII string, and no duplicate characters exist in the string.
I could do one of two things.

Approach 1:

1. Initialize the hashmap beforehand with all 128 possible keys and None as values.
2. Iterate through the string and simply update the dictionary/hashmap, storing the index as each key's value.
3. Iterate through the dictionary elements and remove the key, value pairs whose value is still None.
ascii_occurrence = {chr(i): None for i in range(128)}  # initialize a hashmap with each of the 128 ASCII characters as a key and None as its value
for charIndex in range(len(string)):
    ascii_occurrence[string[charIndex]] = charIndex
indexMap = {k: v for k, v in ascii_occurrence.items() if v is not None}
print(indexMap)
Approach 2:

1. Initialize an EMPTY hashmap with no keys or values.
2. Iterate through the string and create the key, value pairs as you go.
ascii_occurrence = {}
for charIndex in range(len(string)):
    ascii_occurrence[string[charIndex]] = charIndex
print(ascii_occurrence)
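For concreteness, here is a self-contained version of both approaches run side by side on a sample string (the input string is just an example; the question only guarantees ASCII with no duplicates):

```python
# Sample input; the question assumes an ASCII string with no duplicate characters.
string = "python!"

# Approach 1: pre-populate a table for all 128 ASCII characters.
ascii_table = {chr(i): None for i in range(128)}
for charIndex in range(len(string)):
    ascii_table[string[charIndex]] = charIndex
index_map_1 = {k: v for k, v in ascii_table.items() if v is not None}

# Approach 2: start empty and add keys as characters appear.
index_map_2 = {}
for charIndex in range(len(string)):
    index_map_2[string[charIndex]] = charIndex

print(index_map_1 == index_map_2)  # True: both yield the same mapping
```

Both snippets produce the same character-to-index mapping; they differ only in how the dictionary comes into existence.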
I'm certain the time complexity is O(N) in both cases, but I'm not certain about the space complexity of the two approaches.
Arguing about space complexities:
Approach 1: my space does not "DEPEND" on the size of the input. You can assume the hashmap with 128 keys already exists before the program ever sees the input; I'm only updating values, not creating new keys or growing the hashmap based on the input. In this case it is O(1).
Approach 2: the hashmap is initially empty, and you populate it with key, value pairs by iterating through the string. How much you populate the dictionary depends on the input size. In this case it is O(N).
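One rough way to see the difference is to measure the main table itself with sys.getsizeof in CPython (the absolute numbers are implementation details; only the relationship matters, and this sketch ignores the extra filtered dict that approach 1 builds at the end):

```python
import sys

def preallocated_size(string):
    # Approach 1: the 128-key table exists regardless of the input.
    table = {chr(i): None for i in range(128)}
    for i, ch in enumerate(string):
        table[ch] = i  # overwriting values never resizes the dict
    return sys.getsizeof(table)

def grown_size(string):
    # Approach 2: the dict grows with the number of distinct characters.
    table = {}
    for i, ch in enumerate(string):
        table[ch] = i
    return sys.getsizeof(table)

print(preallocated_size("ab") == preallocated_size("abcdefghij"))  # True: fixed size
print(grown_size("ab") < grown_size("abcdefghij"))                 # True: grows with input
```

The pre-allocated table measures the same no matter the input, while the empty-start table's size tracks the number of distinct characters inserted.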
Is my argument correct?
The time complexity of both of your approaches is O(N), not worse: the indexing at each iteration (string[charIndex]) is an O(1) operation in Python, though it is avoidable work. Your second approach is generally the better way to go here, but you can make it more optimized (in terms of run-time) using a dictionary comprehension, as follows:
ascii_occurrence = {char: ind for ind, char in enumerate(string)}
In this case, besides not fetching characters by index, you don't assign items into a previously created dictionary one at a time. Instead, Python builds the dictionary for you on demand, which saves a __setitem__ call at each iteration. The complexity of this snippet, in terms of both run-time and memory, is of course O(N).
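For instance, running the comprehension on a short example string:

```python
string = "abc"  # example input
# Map each character to its index in a single pass.
ascii_occurrence = {char: ind for ind, char in enumerate(string)}
print(ascii_occurrence)  # {'a': 0, 'b': 1, 'c': 2}
```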
Now, if you want to optimize further, it's possible, but you have to trade one resource for another: less run-time usually costs more memory, and vice versa. If you don't want that trade-off, you may want to consider building the dictionary earlier, at the time the main string itself is created. There are also other tricks, such as creating a dict directly by passing an enumerate object to the dict constructor. In that case, however, the indices will be the keys and the characters will be the values.
ascii_occurrence = dict(enumerate(string))
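A quick illustration of that index-keyed form, plus an inversion back to character keys (safe here only because the question guarantees no duplicate characters):

```python
string = "abc"  # example input

# dict(enumerate(...)) keys the mapping by index, not by character.
by_index = dict(enumerate(string))
print(by_index)  # {0: 'a', 1: 'b', 2: 'c'}

# Invert it if you need characters as keys; with duplicates this
# would silently keep only the last index per character.
by_char = {char: ind for ind, char in by_index.items()}
print(by_char)  # {'a': 0, 'b': 1, 'c': 2}
```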