Please, help. I have a data file with 4 columns (userid, movieid, score, timestamp) that looks like this:
196 242 3 881250949
186 302 3 891717742
22 377 1 878887116
196 51 2 880606923
62 257 2 879372434
I am trying to create a nested dictionary that should look like this:
users = {'196': [('242', '3'), ('51', '2')], '186': [('302', '3')] ...}
My code only picks up one tuple (movieid, score) for each userid:
    def create_users_dict():
        try:
            users = {}
            for line in open('u.data'):
                (id, movieid, rating, timestamp) = line.split('\t')[0:4]
                users[id] = (movieid, rating)
        except IOError as ioerr:
            print('There is an error with the file:' + str(ioerr))
        return users

    users = create_users_dict()
This is what I get:
users = {'196': ('51', '2'), '186': ('302', '3')...}
Use setdefault:
    def create_users_dict():
        try:
            users = {}
            with open('u.data') as f:
                for line in f:
                    uid, movie_id, rating, timestamp = line.split()
                    # Get the list stored for uid (creating it if needed), then append
                    users.setdefault(uid, []).append((movie_id, rating))
            return users
        except IOError as ioerr:
            print('There is an error with the file:' + str(ioerr))

    users = create_users_dict()
    print(users)
Output:
{'196': [('242', '3'), ('51', '2')], '62': [('257', '2')], '186': [('302', '3')], '22': [('377', '1')]}
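To see what setdefault does without needing the file, here is a minimal sketch that runs the same loop over the five sample rows from the question. The sample_lines list is a stand-in for u.data and is an assumption of this example (tab-separated, as the question's split('\t') suggests).

```python
# Stand-in for the contents of u.data (assumed tab-separated).
sample_lines = [
    "196\t242\t3\t881250949",
    "186\t302\t3\t891717742",
    "22\t377\t1\t878887116",
    "196\t51\t2\t880606923",
    "62\t257\t2\t879372434",
]

users = {}
for line in sample_lines:
    uid, movie_id, rating, _timestamp = line.split()
    # setdefault returns the list already stored under uid; if uid is
    # missing, it first stores the given empty list and returns that.
    users.setdefault(uid, []).append((movie_id, rating))

print(users["196"])  # [('242', '3'), ('51', '2')]
```

Because setdefault hands back the stored list itself, the append modifies the value in the dictionary directly, so no reassignment is needed.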
A possible alternative is to check whether the key (uid) is already in the dictionary; if it is missing, initialize its value to an empty list, and then simply append:
    def create_users_dict():
        try:
            users = {}
            for line in open('u.data'):
                uid, movie_id, rating, timestamp = line.split()
                if uid not in users:
                    users[uid] = []
                users[uid].append((movie_id, rating))
            return users
        except IOError as ioerr:
            print('There is an error with the file:' + str(ioerr))
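A third variant, not mentioned above, is collections.defaultdict from the standard library, which creates the empty list automatically on first access. This is only a sketch: the function name build_users and its lines parameter are hypothetical, chosen so the example runs on in-memory rows instead of u.data.

```python
from collections import defaultdict


def build_users(lines):
    """Group (movie_id, rating) tuples by user id (hypothetical helper)."""
    users = defaultdict(list)  # a missing key yields a fresh empty list
    for line in lines:
        uid, movie_id, rating, _timestamp = line.split()
        users[uid].append((movie_id, rating))
    # Convert back to a plain dict so later lookups of unknown
    # users raise KeyError instead of silently adding empty lists.
    return dict(users)


users = build_users([
    "196 242 3 881250949",
    "196 51 2 880606923",
    "186 302 3 891717742",
])
print(users)
```

To use it with the real file, one could pass an open file object, since iterating over a file yields its lines.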
As a side note, you should not use id as a variable name, because it shadows the built-in function id.
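A small illustration of what shadowing means here, as a sketch (the function name shadowing_demo is made up for this example): once id is bound to a string inside a scope, the built-in can no longer be called by that name there.

```python
def shadowing_demo():
    id = '196'  # this local name now hides the built-in id()
    try:
        id(object())  # tries to call the string, not the built-in
    except TypeError as err:
        return str(err)


message = shadowing_demo()
print(message)
```

Renaming the variable to something like uid, as the answers above do, avoids the problem.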