Thursday, October 10, 2024

Data Science Meets Politics. Unraveling Congressional Dynamics With… | by Luiz Venosa | Sep, 2024



First, we need data.

I downloaded data on all the laws voted on and how each member of Congress voted from 2023 to 2024, up to May 18th. All the data is available on the Brazilian Congress’s open data portal. I then created two different pandas dataframes, one with all the laws voted on and another with how each congress member voted in each vote.

import pandas as pd

# Voting sessions for 2023 and 2024 (semicolon-separated CSV files)
votacoes = pd.concat([pd.read_csv('votacoes-2023.csv', header=0, sep=';'), pd.read_csv('votacoes-2024.csv', header=0, sep=';')])
# Individual votes cast by each deputy in each session
votacoes_votos_dep = pd.concat([pd.read_csv('votacoesVotos-2023.csv', sep=';', quoting=1), pd.read_csv('votacoesVotos-2024.csv', sep=';', on_bad_lines='warn', quoting=1, encoding='utf-8')])

From the votacoes dataframe, I selected only the entries with idOrgao of 180, which means they were voted on in the main chamber of Congress. So, we have the data for the votes of most congress members. Then I used the list of votacoes_ids to filter the votacoes_votos_dep dataframe.

plen = votacoes[votacoes['idOrgao'] == 180]
votacoes_ids = plen['id'].unique()
votacoes_votos_dep = votacoes_votos_dep[votacoes_votos_dep['idVotacao'].isin(votacoes_ids)]

Now, in votacoes_votos_dep, each vote is a row with the congress member’s name and the voting session ID, identifying who cast the vote and what it refers to. Therefore, I created a pivot table so that each row represents a congress member and each column refers to a vote, encoding Yes as 1 and No as 0 and dropping any vote with fewer than 280 recorded Yes or No votes.

votacoes_votos_dep['voto_numerico'] = votacoes_votos_dep['voto'].map({'Sim': 1, 'Não':0})
votes_pivot = votacoes_votos_dep.pivot_table(index='deputado_nome', columns='idVotacao', values='voto_numerico').dropna(axis=1, thresh=280)
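
To make the reshaping step concrete, here is a minimal sketch on made-up data. The deputy names, vote IDs, and the "Abstenção" value are hypothetical, and the threshold is lowered to 3 to suit the toy size. It shows how any vote other than Sim or Não becomes missing, and how thresh keeps only the columns with enough recorded votes.

```python
import pandas as pd

# Hypothetical toy data in the same long format: one row per (deputy, vote).
toy = pd.DataFrame({
    "deputado_nome": ["Ana", "Ana", "Bruno", "Bruno", "Carla"],
    "idVotacao": ["v1", "v2", "v1", "v2", "v1"],
    "voto": ["Sim", "Não", "Sim", "Abstenção", "Não"],
})

# Values missing from the mapping (here "Abstenção") become NaN.
toy["voto_numerico"] = toy["voto"].map({"Sim": 1, "Não": 0})

# One row per deputy, one column per vote.
toy_pivot = toy.pivot_table(
    index="deputado_nome", columns="idVotacao", values="voto_numerico"
)

# thresh=3 keeps only the columns with at least 3 non-missing values.
filtered = toy_pivot.dropna(axis=1, thresh=3)
print(filtered)
```

In this toy case only v1 survives, because v2 has a single recorded Yes or No vote.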

Before computing the similarity matrix, I filled all remaining NAs with 0.5 so as not to interfere with the positioning of the congress member. Finally, we compute the similarity between the vectors of each deputy using cosine similarity and store it in a dataframe.

from sklearn.metrics.pairwise import cosine_similarity

# Fill the remaining missing votes with the neutral value 0.5
votes_pivot = votes_pivot.fillna(0.5)
similarity_matrix = cosine_similarity(votes_pivot)
similarity_df = pd.DataFrame(similarity_matrix, index=votes_pivot.index, columns=votes_pivot.index)

Similarity Matrix – Image by the Author
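
As a small illustration of what cosine similarity measures on vote vectors, the sketch below computes it by hand with NumPy as the normalized dot product. The three vectors are invented for the example, and the helper name cosine is hypothetical.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity: dot product divided by the product of the norms."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical vote vectors: 1 = Yes, 0 = No, 0.5 = filled-in missing vote.
deputy_a = np.array([1.0, 0.0, 1.0, 1.0])
deputy_b = np.array([1.0, 0.0, 1.0, 1.0])   # votes exactly like deputy_a
deputy_c = np.array([0.0, 1.0, 0.0, 0.0])   # votes the opposite way
deputy_d = np.array([1.0, 0.5, 0.5, 1.0])   # missed two of the four votes

sim_same = cosine(deputy_a, deputy_b)
sim_opposite = cosine(deputy_a, deputy_c)
sim_partial = cosine(deputy_a, deputy_d)
print(sim_same, sim_opposite, sim_partial)
```

Identical voting records score 1, fully opposite records score 0, and a record with filled-in gaps lands somewhere in between.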

Now, use the information about the voting similarities between congressmen to build a network using NetworkX. A node will represent each member.

import networkx as nx

names = similarity_df.columns
# Create the graph and add one node per congress member
G = nx.Graph()
for name in names:
    G.add_node(name)

Then, the edges connecting two nodes represent a similarity of at least 75% between the two congressmen’s voting behavior. Also, to deal with the fact that some congress members have dozens of peers with high degrees of similarity, I only selected the 25 congressmen with the highest similarity to be given an edge.

from collections import defaultdict

threshold = 0.75
# For each member, collect the peers whose similarity exceeds the threshold
counter = defaultdict(list)
for i in range(len(similarity_matrix)):
    for j in range(i + 1, len(similarity_matrix)):
        if similarity_matrix[i][j] > threshold:
            counter[names[i]].append((names[j], similarity_matrix[i][j]))
# Keep only the 25 most similar peers of each member as edges
for source, candidates in counter.items():
    selected_targets = sorted(candidates, key=lambda x: x[1], reverse=True)[:25]
    for target, weight in selected_targets:
        G.add_edge(source, target, weight=weight)
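
The effect of combining a threshold with a per-member cap is easier to see on a tiny matrix. The sketch below wraps the same logic in a hypothetical helper, build_graph, and runs it on an invented 4×4 similarity matrix with a cap of 1 instead of 25.

```python
from collections import defaultdict

import networkx as nx
import numpy as np

def build_graph(sim, names, threshold, k):
    """Link each node to at most k later-indexed peers above the threshold."""
    candidates = defaultdict(list)
    n = len(names)
    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] > threshold:
                candidates[names[i]].append((names[j], sim[i][j]))

    graph = nx.Graph()
    graph.add_nodes_from(names)
    for source, targets in candidates.items():
        best = sorted(targets, key=lambda t: t[1], reverse=True)[:k]
        for target, weight in best:
            graph.add_edge(source, target, weight=float(weight))
    return graph

# Invented similarity values for four members.
toy_names = ["A", "B", "C", "D"]
toy_sim = np.array([
    [1.00, 0.90, 0.80, 0.10],
    [0.90, 1.00, 0.95, 0.20],
    [0.80, 0.95, 1.00, 0.30],
    [0.10, 0.20, 0.30, 1.00],
])

G_toy = build_graph(toy_sim, toy_names, threshold=0.75, k=1)
print(sorted(G_toy.edges()))
```

Here A and C are above the threshold, but the cap of 1 keeps only A's strongest link (to B), while D stays isolated because none of its similarities pass 0.75.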

To visualize the network, you need to decide the position of each node in the plane. I decided to use the spring layout, which treats the edges as springs holding connected nodes close while pushing all nodes apart. Adding a seed allows for reproducibility since it is a random process.

pos = nx.spring_layout(G, k=0.1, iterations=50, seed=29)
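
To check the reproducibility claim, the following sketch runs the layout twice with the same seed on NetworkX's built-in karate club graph, which stands in for the congress network here, and compares the resulting coordinates.

```python
import networkx as nx
import numpy as np

# A small built-in graph used as a stand-in for the congress network.
G_demo = nx.karate_club_graph()

# Same parameters and same seed on both runs.
pos_a = nx.spring_layout(G_demo, k=0.1, iterations=50, seed=29)
pos_b = nx.spring_layout(G_demo, k=0.1, iterations=50, seed=29)

# Each position is a 2D coordinate; identical seeds should give identical layouts.
same_layout = all(np.allclose(pos_a[n], pos_b[n]) for n in G_demo.nodes())
print(same_layout)
```

Without a fixed seed, the starting positions are drawn at random, so the picture can change between runs even though the graph is the same.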

Finally, we plot the network using a Plotly go.Figure, adding the edges and nodes individually based on their positions.


import plotly.graph_objects as go

# Create Edges
edge_x = []
edge_y = []
for edge in G.edges():
    x0, y0 = pos[edge[0]]
    x1, y1 = pos[edge[1]]
    # None breaks the line so consecutive edges are not joined
    edge_x.extend([x0, x1, None])
    edge_y.extend([y0, y1, None])

# Add edges as a scatter plot
edge_trace = go.Scatter(x=edge_x, y=edge_y, line=dict(width=0.5, color='#888'), hoverinfo='none', mode='lines')

# Create Nodes
node_x = []
node_y = []
for node in G.nodes():
    x, y = pos[node]
    node_x.append(x)
    node_y.append(y)

# Add nodes as a scatter plot
node_trace = go.Scatter(x=node_x, y=node_y, mode='markers+text', hoverinfo='text', marker=dict(showscale=True, colorscale='YlGnBu', size=10, color=[], line_width=2))

# Add text to the nodes
node_trace.text = list(G.nodes())

# Create a figure
fig = go.Figure(data=[edge_trace, node_trace],
                layout=go.Layout(showlegend=False, hovermode='closest', margin=dict(b=0, l=0, r=0, t=0), xaxis=dict(showgrid=False, zeroline=False, showticklabels=False), yaxis=dict(showgrid=False, zeroline=False, showticklabels=False)))

fig.show()

Result:

Image by the Author

Well, it’s a good start. Different clusters of congressmen can be seen, which suggests that it accurately captures the political alignment and alliances in Congress. But it is a mess, and it is impossible to really discern what’s going on.

To improve the visualization, I made the name appear only when you hover over the node. Also, I colored the nodes according to the political parties and coalitions available on Congress’s website and sized them based on how many edges they are connected to.
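
The article does not show the code for this step, so the following is only a sketch of one way it could be done, not the author's implementation. The party labels, the color palette, and the size formula are all hypothetical. It prepares the node styling with NetworkX alone, so the resulting dictionary can be handed to a Plotly scatter trace.

```python
import networkx as nx

# Tiny stand-in graph; in the article this would be the congress network G.
G_demo = nx.Graph()
G_demo.add_edges_from([("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")])

# Hypothetical party assignment and color palette (not from the source data).
party_of = {"A": "P1", "B": "P1", "C": "P2", "D": "P3"}
palette = {"P1": "#d62728", "P2": "#1f77b4", "P3": "#2ca02c"}

nodes = list(G_demo.nodes())
degrees = dict(G_demo.degree())

# Size grows with the number of edges; color follows the party.
marker = dict(
    size=[6 + 3 * degrees[n] for n in nodes],
    color=[palette[party_of[n]] for n in nodes],
    line_width=1,
)

# With mode="markers" and hoverinfo="text", labels show only on hover.
hover_text = [f"{n} ({party_of[n]}), {degrees[n]} connections" for n in nodes]
scatter_kwargs = dict(mode="markers", hoverinfo="text", text=hover_text, marker=marker)
print(scatter_kwargs)
```

These keyword arguments could then be passed along with the coordinates, for example go.Scatter(x=node_x, y=node_y, **scatter_kwargs), in place of the earlier node trace.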

Image by the Author

It’s a lot better. We have three clusters, with some nodes between them and a few bigger ones in each. Also, in each cluster, there is a majority of a particular color. Well, let’s dissect it.


