SQuAD (Stanford Question Answering Dataset)

SQuAD

The Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset consisting of questions posed by crowdworkers on a set of Wikipedia articles. The answer to each question is a segment of text (a span) from the corresponding reading passage, or the question may be unanswerable.

SQuAD 1.1

SQuAD 1.1, the previous version of the SQuAD dataset, contains 100,000+ question-answer pairs on 500+ articles.

Training example
{
    "data": [
        {
            "title": "University_of_Notre_Dame",
            "paragraphs": [
                {
                    "context": "Architecturally, the school has a Catholic character. Atop the Main Building's gold dome is a golden statue of the Virgin Mary. Immediately in front of the Main Building and facing it, is a copper statue of Christ with arms upraised with the legend \"Venite Ad Me Omnes\". Next to the Main Building is the Basilica of the Sacred Heart. Immediately behind the basilica is the Grotto, a Marian place of prayer and reflection. It is a replica of the grotto at Lourdes, France where the Virgin Mary reputedly appeared to Saint Bernadette Soubirous in 1858. At the end of the main drive (and in a direct line that connects through 3 statues and the Gold Dome), is a simple, modern stone statue of Mary.",
                    "qas": [
                        {
                            "answers": [
                                {
                                    "answer_start": 515,
                                    "text": "Saint Bernadette Soubirous"
                                }
                            ],
                            "question": "To whom did the Virgin Mary allegedly appear in 1858 in Lourdes France?",
                            "id": "5733be284776f41900661182"
                        }
                    ]
                }
            ]
        },
        …
   ]
}
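The nesting above (data, paragraphs, qas, answers) and the meaning of `answer_start` as a character offset into `context` can be illustrated with a minimal sketch. The helper names `iter_qas` and `span_matches` are hypothetical, and the record is a shortened stand-in for the example above whose offset is computed rather than copied from the dataset.

```python
def iter_qas(dataset):
    """Yield (title, context, qa) for every question in a SQuAD-style dict."""
    for article in dataset["data"]:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                yield article["title"], paragraph["context"], qa


def span_matches(context, answer):
    """Return True if answer_start points at the answer text inside context."""
    start = answer["answer_start"]
    return context[start:start + len(answer["text"])] == answer["text"]


# Shortened, hypothetical record in the same layout as the training example.
context = ("It is a replica of the grotto at Lourdes, France where the Virgin "
           "Mary reputedly appeared to Saint Bernadette Soubirous in 1858.")
answer_text = "Saint Bernadette Soubirous"
sample = {
    "data": [{
        "title": "University_of_Notre_Dame",
        "paragraphs": [{
            "context": context,
            "qas": [{
                # Offset computed here for the shortened context (an assumption
                # of this sketch, not the value stored in the real dataset).
                "answers": [{"answer_start": context.index(answer_text),
                             "text": answer_text}],
                "question": "To whom did the Virgin Mary allegedly appear in 1858 in Lourdes France?",
                "id": "5733be284776f41900661182",
            }],
        }],
    }]
}

# Collect the ids of questions whose stored offset does not match the text.
mismatches = [qa["id"]
              for _, ctx, qa in iter_qas(sample)
              for ans in qa["answers"]
              if not span_matches(ctx, ans)]
print(mismatches)
```

A check like this is useful after any preprocessing that changes the context string, since even one inserted or deleted character shifts every later offset.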
Inference example
{
    "data": [
        {
            "title": "Super_Bowl_50",
            "paragraphs": [
                {
                    "context": "Super Bowl 50 was an American football game to determine the champion of the National Football League (NFL) for the 2015 season. The American Football Conference (AFC) champion Denver Broncos defeated the National Football Conference (NFC) champion Carolina Panthers 24\u201310 to earn their third Super Bowl title. The game was played on February 7, 2016, at Levi's Stadium in the San Francisco Bay Area at Santa Clara, California. As this was the 50th Super Bowl, the league emphasized the \"golden anniversary\" with various gold-themed initiatives, as well as temporarily suspending the tradition of naming each Super Bowl game with Roman numerals (under which the game would have been known as \"Super Bowl L\"), so that the logo could prominently feature the Arabic numerals 50.",
                    "qas": [
                        {
                            "answers": [
                                {
                                    "answer_start": 177,
                                    "text": "Denver Broncos"
                                },
                                {
                                    "answer_start": 177,
                                    "text": "Denver Broncos"
                                },
                                {
                                    "answer_start": 177,
                                    "text": "Denver Broncos"
                                }
                            ],
                            "question": "Which NFL team represented the AFC at Super Bowl 50?",
                            "id": "56be4db0acb8001400a502ec"
                        },
                        {
                            "answers": [
                                {
                                    "answer_start": 249,
                                    "text": "Carolina Panthers"
                                },
                                {
                                    "answer_start": 249,
                                    "text": "Carolina Panthers"
                                },
                                {
                                    "answer_start": 249,
                                    "text": "Carolina Panthers"
                                }
                            ],
                            "question": "Which NFL team represented the NFC at Super Bowl 50?",
                            "id": "56be4db0acb8001400a502ed"
                        }
                        ...
                    ]
                    ...
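Unlike the training example, each question here carries several reference answers, which are often identical. A small sketch of collapsing them into a list of distinct reference strings follows; the function name `reference_answers` is hypothetical, and collapsing whitespace is an assumption made here to tolerate stray spaces in extracted text.

```python
def reference_answers(qa):
    """Return the distinct reference answer texts for one question, in order."""
    # Collapse runs of whitespace so that "Denver     Broncos" and
    # "Denver Broncos" count as the same reference.
    texts = (" ".join(answer["text"].split()) for answer in qa["answers"])
    # dict.fromkeys keeps first-seen order while dropping duplicates.
    return list(dict.fromkeys(texts))


qa = {
    "id": "56be4db0acb8001400a502ec",
    "question": "Which NFL team represented the AFC at Super Bowl 50?",
    "answers": [
        {"answer_start": 177, "text": "Denver Broncos"},
        {"answer_start": 177, "text": "Denver Broncos"},
        {"answer_start": 177, "text": "Denver Broncos"},
    ],
}
print(reference_answers(qa))  # ['Denver Broncos']
```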
Prediction result: predictions.json
{
    "56be4db0acb8001400a502ec": "Denver Broncos",
    "56be4db0acb8001400a502ed": "Denver Broncos",
    "56be4db0acb8001400a502ee": "February 7, 2016,",
    "56be4db0acb8001400a502ef": "Denver Broncos",
    "56be4db0acb8001400a502f0": "gold",
    "56be8e613aeaaa14008c90d1": "golden anniversary",
    "56be8e613aeaaa14008c90d2": "February 7, 2016,",
    "56be8e613aeaaa14008c90d3": "American Football Conference",
    "56bea9923aeaaa14008c91b9": "golden anniversary",
    "56bea9923aeaaa14008c91ba": "American Football Conference",
    "56bea9923aeaaa14008c91bb": "February 7, 2016,",
    "56beace93aeaaa14008c91df": "Denver Broncos",
    "56beace93aeaaa14008c91e0": "Levi's Stadium",
    "56beace93aeaaa14008c91e1": "Santa Clara, California.",
    "56beace93aeaaa14008c91e2": "\"Super Bowl L\"),",
    "56beace93aeaaa14008c91e3": "2015",
    "56bf10f43aeaaa14008c94fd": "2015",
    "56bf10f43aeaaa14008c94fe": "Santa Clara, California.",
    "56bf10f43aeaaa14008c94ff": "Levi's Stadium",
    "56bf10f43aeaaa14008c9500": "24\u201310",
    "56bf10f43aeaaa14008c9501": "February 7, 2016, at Levi's Stadium in the San Francisco Bay Area at Santa Clara, California.",
    "56d20362e7d4791d009025e8": "2015",
    "56d20362e7d4791d009025e9": "Denver Broncos",
    "56d20362e7d4791d009025ea": "Denver Broncos",
    "56d20362e7d4791d009025eb": "Denver Broncos",
    "56d600e31c85041400946eae": "2015",
    "56d600e31c85041400946eb0": "Denver Broncos",
    "56d600e31c85041400946eb1": "February 7, 2016, at Levi's Stadium in the San Francisco Bay Area at Santa Clara, California.",
    "56d9895ddc89441400fdb50e": "Super Bowl 50",
    "56d9895ddc89441400fdb510": "Denver Broncos",
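A predictions file like this maps each question id to one answer string, which is then compared against the reference answers. The sketch below shows one way to compute exact match and token-level F1, taking the best score over all references. It follows the general approach of the SQuAD evaluation script but is a reimplementation written for illustration, not the script itself; the function names are this sketch's own.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, strip punctuation and articles, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, reference):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1(prediction, reference):
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def best_score(metric, prediction, references):
    """Score against every reference answer and keep the best result."""
    return max(metric(prediction, ref) for ref in references)


# Trailing punctuation in a prediction disappears after normalization.
print(normalize("February 7, 2016,"))  # february 7 2016
# Question ...02ec: prediction and reference are both "Denver Broncos".
print(best_score(exact_match, "Denver Broncos", ["Denver Broncos"]))  # 1.0
# Question ...02ed: the prediction misses the reference "Carolina Panthers".
print(best_score(f1, "Denver Broncos", ["Carolina Panthers"]))  # 0.0
```

Normalization explains why predictions such as "February 7, 2016," or "Santa Clara, California." are not penalized for the stray punctuation they carry over from the passage.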

SQuAD2.0

SQuAD2.0 combines the 100,000 questions in SQuAD1.1 with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones. To do well on SQuAD2.0, systems must not only answer questions when possible, but also determine when no answer is supported by the paragraph and abstain from answering.
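In the SQuAD2.0 files, each question carries an `is_impossible` flag, and a system abstains by predicting an empty string. The sketch below shows how abstention might be scored and decided. The record for the unanswerable question is invented for illustration, the string comparison is deliberately simpler than a full normalization, and the thresholding rule in `decide` is one common convention assumed here rather than something the dataset prescribes.

```python
def gold_texts(qa):
    """Reference answers for a SQuAD2.0-style question; [''] means 'no answer'."""
    if qa.get("is_impossible", False):
        return [""]
    return [answer["text"] for answer in qa["answers"]]


def exact_match_v2(prediction, qa):
    """True if the prediction matches a reference (case and spacing ignored)."""
    pred = " ".join(prediction.lower().split())
    return any(pred == " ".join(gold.lower().split()) for gold in gold_texts(qa))


def decide(best_text, best_score, null_score, threshold=0.0):
    """Abstain (return '') when the no-answer score beats the best span score
    by more than the threshold. Scores and threshold are model-specific."""
    return "" if null_score - best_score > threshold else best_text


answerable = {
    "id": "56be4db0acb8001400a502ec",
    "is_impossible": False,
    "answers": [{"answer_start": 177, "text": "Denver Broncos"}],
}
# Hypothetical unanswerable record, made up for this sketch.
unanswerable = {"id": "hypothetical-id", "is_impossible": True, "answers": []}

print(exact_match_v2("Denver Broncos", answerable))    # True
print(exact_match_v2("", unanswerable))                # True: correct abstention
print(exact_match_v2("Denver Broncos", unanswerable))  # False: should have abstained
print(repr(decide("Denver Broncos", best_score=1.0, null_score=5.0)))  # ''
```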
