Hello and welcome!

I am delighted to write my first blog post about my acceptance into GSoC 2023 with DBpedia. Throughout the summer, I will work on the Neural Extraction Framework project.

Background:

It all started last year when I came to know about GSoC. That was when I learned about open source, and I liked the idea that anyone can contribute to and change the code of so much of the software we use every day. I found an organization and worked hard on contributing and writing a proposal. Though I was not selected for GSoC, I added two new distance metrics to a vector database later that summer. Since then, I have kept looking for opportunities in open-source projects, especially those I use.

And here I am, writing about my acceptance into GSoC 2023 for the Neural Extraction Framework project. I would like to thank my mentors for giving me helpful feedback while I was writing the proposal.

The Neural Extraction Framework:

The Neural Extraction Framework project is in its third iteration this year, and I will be building on the work done in previous iterations. The project aims to extract information from unstructured text in Wikipedia articles in the form of relational triples (subject -> predicate -> object) that can be added to the DBpedia knowledge base.
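
To make the idea of a relational triple concrete, here is a small sketch of how one could be represented and written out in N-Triples syntax. The `Triple` class, the `to_ntriples` helper and the Einstein example are my own illustrations, not code from the project.

```python
from dataclasses import dataclass

# DBpedia's usual namespaces for resources and ontology properties.
DBR = "http://dbpedia.org/resource/"
DBO = "http://dbpedia.org/ontology/"


@dataclass(frozen=True)
class Triple:
    """A relational triple: subject -> predicate -> object (hypothetical helper class)."""
    subject: str
    predicate: str
    object: str


def to_ntriples(triple: Triple) -> str:
    """Serialize a triple whose three parts are all URIs as one N-Triples line."""
    return f"<{triple.subject}> <{triple.predicate}> <{triple.object}> ."


# Illustrative example only: "Albert Einstein was born in Ulm."
example = Triple(DBR + "Albert_Einstein", DBO + "birthPlace", DBR + "Ulm")
print(to_ntriples(example))
```

Each part of the triple is a URI, so the subject and object point to DBpedia resources and the predicate points to a DBpedia property.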

Text has much more information

Currently, tables and infoboxes are used to find relations between Wikipedia articles. In this project, we want to use the text of the article itself: identify the wikilinks of entities in the Wikipedia article, map those entities to resources in DBpedia, find the relation between the entities, and then map this relation to a DBpedia property.
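
The steps above can be sketched roughly as follows. This is only a toy outline to show the flow, not the project's implementation: all function names are made up, and the relation step is a simple phrase lookup standing in for the neural relation extraction model that the project will actually use.

```python
import re
from typing import List, Optional, Tuple

DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"
DBPEDIA_ONTOLOGY = "http://dbpedia.org/ontology/"

# Matches [[Target]] and [[Target|label]] and captures the link target.
WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")

# Toy phrase-to-property table (an assumption for illustration only).
PHRASE_TO_PROPERTY = {
    "born in": "birthPlace",
    "died in": "deathPlace",
}


def extract_wikilinks(wikitext: str) -> List[str]:
    """Step 1: collect the wikilink targets found in the text."""
    return [m.group(1).strip() for m in WIKILINK.finditer(wikitext)]


def to_dbpedia_resource(title: str) -> str:
    """Step 2: map a Wikipedia page title to a DBpedia resource URI."""
    return DBPEDIA_RESOURCE + title.replace(" ", "_")


def guess_property(sentence: str) -> Optional[str]:
    """Steps 3 and 4: find a relation and map it to a DBpedia property.

    A keyword lookup is used here as a placeholder for a trained model.
    """
    for phrase, prop in PHRASE_TO_PROPERTY.items():
        if phrase in sentence:
            return DBPEDIA_ONTOLOGY + prop
    return None


def extract_triple(sentence: str) -> Optional[Tuple[str, str, str]]:
    """Run all steps on one sentence; return None if any step finds nothing."""
    links = extract_wikilinks(sentence)
    prop = guess_property(sentence)
    if len(links) < 2 or prop is None:
        return None
    return (to_dbpedia_resource(links[0]), prop, to_dbpedia_resource(links[1]))


print(extract_triple("[[Albert Einstein]] was born in [[Ulm]]."))
```

A real system also has to handle entities that are not linked, pronouns that refer back to them, and sentences with more than two entities, which is where NER, coreference resolution and relation extraction come in.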

Throughout the proposal-writing process, I got a chance to read a lot of research papers on NER, coreference resolution, and relation extraction, and to get a glimpse of their source code. It was also fun to try out different approaches using popular libraries like spaCy and transformers.

Community Bonding period:

After the contributors were announced, the GSoC team organized “welcome talks” and some fun activities like the “GSoC contributor summit”. Attending these was fun, and they also helped clear up some doubts about the process. Our organization, DBpedia, also held a meetup with all of this year's contributors, and it was amazing to meet them and learn about their projects.

As the coding period begins next week, I spent the community bonding period learning about DBpedia, setting myself up for the coding period, exploring some amazing open-source tools, and creating this blog site. This is the first time I am writing a blog!

As I move through the project, I will try to document what I learn and the status of the project on this blog.

Thank you!