Week 3
Week 3 (July 11th - July 15th)
July 12th:
On this day I met with Professor Ordóñez Román and he introduced to me 2 new vision and language models, Google’s ALIGN Model and CoCa from Google. These two models where very interesting to read about and where very similar to the CLIP model which I have been using.
Moreover, I was asked to conduct the tests I did last week on a different CLIP model, the ViT-L/14@336px model. However, instead of using Google Collab which I had been using, I was introduced to Rice’s Vision computing infrastructure where I began using python scripts to automate the evaluation of these models.
July 13th:
Here’s some of the important tasks I accomplished this day:
- Familiarizing myself with Rice’s Vision computing infrastructure
- Installing the VPN on my computer
- Installing Python into Home Directory on Server - Anaconda for Linux
- Running a remote jupyter notebook
- Understand screens + GPU allocation on terminal
- Upload Winoground dataset into server under /scratch/vsf2
- Create a python script to evaluate larger CLIP model and run it on the servers
Along with completing these tasks over the last couple of days I was able to run the python script and generate a new table for my tests. Like I mentioned before, to run the test the script must upload both the CLIP model and Winoground challenge dataset and then use the images in the Winoground dataset to test the CLIP model, and in this case multiple models. This table depicts scores from Google Collab on the B/32 and L/14-336px CLIP models against the raw article scores.
July 14th:
Today, I met with my mentor for our biweekly meeting where we discussed possibly submitting my project to the Grace Hopper Celebration as well as honing in on what our project would look like as we move further. At this point we wanted to have a more concrete idea of our tests. Due to this I began brainstorming possible test ideas with the working knoweledge I had from the other articles.
Here is an outline of all the material considered throughout this process:
Estimating and Mitigating Gender Bias in Deep Image Representations
- In visual recognition tasks such systems using images containing people allow the risk of amplifying societal stereotypes
- Predicting gender from labels outputted
- Gender Bias
Gender Bias in Coreference Resolution
- Coreference systems, NLP models, exhibit gender bias
- Creating a new challenge corpus, WinoBias Benchmark, for analyzing text biases
- Bias using occupations and gender stereotypical roles
Towards Understanding Gender-Seniority Compound Bias in Natural Language Generation
- Women are often perceived as junior to their male counterparts, even within the same job titles Bias using U.S. Senatorship and U.S. Professorship
- Whether seniority impacts the degree of gender bias exhibited in pretrained neural generation models by introducing a novel framework for probing compound bias
Winoground: Probing Vision and Language Models for Visio-Linguistic Compositionality
- Winoground, a novel task and dataset for evaluating the ability of vision and language models to conduct visio-linguistic compositional reasoning
- Given two images and two captions, the model must match them correctly
- Challenge: both captions contains a completely identical set of words, only in a different order
- New Proposal: Measuring and Mitigating Societal Biases in Vision and Language Models for Open Recognition
- Biases in age and gender in different roles (careers)
- Pastor
- Judges
From this brainstorm I came up with some starting ideas:
- Possibly use VisualNews dataset
- Maybe look at bias as it relates to gender and occupation
- Worst of Both Worlds: Biases Compound in Pre-trained Vision-and-Language Models
July 15th:
This day, I finished working on the CLIP Large Model, ViT-L/14@336px. I created another table comparing the results/scores I got from running the python script on the server, on Google Collab, and raw results from the article.
Upcoming
- Keep checking resources discussed in previous meetings
- Brainstorm Test Ideas
- Look through all CLIP models