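I have two word lists and a database containing thousands of news articles.
For each article in the database, I want to count how many words from the $badwords and $goodwords lists it contains. Then I want to store the two results per row in the badwords and goodwords columns (for $badwords and $goodwords respectively). I will run this script as a cron job.
My current table structure (the last two columns are empty):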
TABLE news
-----------------
|ID|newstitle|newscontent|badwords|goodwords|
|1| Rain in London | It is horrible depressive weather in this nice city. | EMPTY | EMPTY |
|2| France wins the WorldCup | The player made a great goal. | EMPTY | EMPTY |
The table structure I want (the last two columns hold the $badwords and $goodwords counts):
TABLE news
-----------------
|ID|newstitle|newscontent|badwords|goodwords|
|1| Rain in London | It is horrible depressive weather in this nice city. | 2 | 1 |
|2| France wins the WorldCup | The player made a great goal. | 0 | 1 |
My current PHP code:
<?php
//the wordlists
$badwords = "depressive horrible";
$goodwords = "great";
//connection to the database
$servername = "localhost";
$username = "user";
$password = "pass";
$dbname = "db";
$conn = new mysqli($servername, $username, $password, $dbname);
// here is my sql query
$sql = " UPDATE news
set badwords = (SELECT count (*) from news
where newscontent LIKE '.%$badwords%.')";
//close the connection
$conn->close();
?>
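If I understand your question correctly, you want to check whether a given word list exists in the database. In that case you are looking for a query like this (also escape the value in the query according to the database class you use, e.g. mysqli_real_escape_string()):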
SELECT COUNT(*) AS `count`
,`newscontent`
FROM `news`
WHERE `newscontent` = '" . $wordlist . "'
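The escaping advice above can also be met with a prepared statement, which keeps the value out of the SQL string entirely. A minimal sketch, assuming a mysqli connection: countExactMatches() is a hypothetical helper name, and the table and column names are taken from the question.

```php
<?php
// Hypothetical helper: counts rows whose `newscontent` equals $wordlist.
// The value is bound as a parameter, so no manual escaping is needed.
function countExactMatches(mysqli $conn, string $wordlist): int
{
    $stmt = $conn->prepare(
        'SELECT COUNT(*) FROM `news` WHERE `newscontent` = ?'
    );
    $stmt->bind_param('s', $wordlist);
    $stmt->execute();
    $stmt->bind_result($count);
    $stmt->fetch();
    $stmt->close();

    return (int) $count;
}
```

Calling countExactMatches($conn, $wordlist) with an open connection would return the number of exact matches as an integer.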
If you want to show how many times each word list occurs in the database, this is what you need:
SELECT COUNT(*) AS `count`
,`newscontent`
FROM `news`
GROUP BY `newscontent`
If you want to show how many strings contain a given number of words, this is what you are looking for:
<?php
$sql = new mysqli($host, $user, $password, $database);
$query = $sql->query('select * from `news`');
$summary = [];
while($record = $query->fetch_object()) {
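// number of space-separated words in this article
$words = count(explode(' ', $record->newscontent));
// initialise the bucket on first use to avoid an undefined-key warning
$summary[$words] = ($summary[$words] ?? 0) + 1;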
}
echo '<pre>';
print_r($summary);
echo '</pre>';
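If none of the above is what you want, then after reading your question four times I really don't know what you are looking for.
Updated answer: now that you have updated your question, I understand what you want. See the updated answer below.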
<?php
// your db connection ...
// array with good and bad words
$good = [
'awesome',
'neat',
'fantastic',
'great',
// and so on
];
$bad = [
'horrible',
'worst',
'bad',
'terrific',
// and so on
];
// if you keep using your string approach you can set $good and $bad with $good = explode(' ', $goodwords); and $bad = explode(' ', $badwords);
// fetch the record you need
$query = $sql->query('select * from `news` where `ID` = 1'); // insert parameter for your ID here instead of just 1
$newsitem = $query->fetch_object();
// set up good and bad word counters
$totalGood = 0;
$totalBad = 0;
// check how many times each word is mentioned in newscontent
foreach($good as $word) {
// add spaces around the word so only the full word is matched, not a part
// (note: this misses words at the start or end of the text or next to punctuation)
$totalGood += substr_count($newsitem->newscontent, ' ' . $word . ' ');
}
// check how many times each word is mentioned in newscontent
foreach($bad as $word) {
// add spaces around the word so only the full word is matched, not a part
// (note: this misses words at the start or end of the text or next to punctuation)
$totalBad += substr_count($newsitem->newscontent, ' ' . $word . ' ');
}
// update the record
$sql->query("
update `news`
set `badwords` = " . $totalBad . ",
`goodwords` = " . $totalGood . "
where `ID` = " . $newsitem->ID);
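The space-padded substr_count() check only finds words that have a space on both sides, so a word followed by a full stop or placed at the very start of the text is not counted. Below is a minimal sketch of an alternative, assuming case-insensitive whole-word matching is what you want, plus a loop over all rows for the cron job. countWords() and updateAllCounts() are hypothetical helper names, not part of the answer above; the table and column names are taken from the question.

```php
<?php
// Hypothetical helper: counts whole-word, case-insensitive occurrences of
// every word in $words inside $text, using \b word boundaries.
function countWords(string $text, array $words): int
{
    $words = array_filter($words, 'strlen'); // drop empty entries
    if (!$words) {
        return 0;
    }
    // Escape each word so it is matched literally inside the pattern.
    $escaped = array_map(fn($w) => preg_quote($w, '/'), $words);
    $pattern = '/\b(?:' . implode('|', $escaped) . ')\b/i';

    return (int) preg_match_all($pattern, $text);
}

// Hypothetical batch step for the cron job: recompute both counters for
// every row. Assumes the `news` table and column names from the question.
function updateAllCounts(mysqli $conn, array $good, array $bad): void
{
    $update = $conn->prepare(
        'UPDATE `news` SET `badwords` = ?, `goodwords` = ? WHERE `ID` = ?'
    );
    // query() buffers the result set, so the update can run inside the loop.
    $rows = $conn->query('SELECT `ID`, `newscontent` FROM `news`');
    while ($row = $rows->fetch_assoc()) {
        $content   = (string) $row['newscontent'];
        $badCount  = countWords($content, $bad);
        $goodCount = countWords($content, $good);
        $id        = (int) $row['ID'];
        $update->bind_param('iii', $badCount, $goodCount, $id);
        $update->execute();
    }
    $update->close();
}
```

For the first sample row, countWords('It is horrible depressive weather in this nice city.', ['depressive', 'horrible']) returns 2. The cron script would then call updateAllCounts($conn, $good, $bad) once per run.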
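One interesting aspect of text interpretation remains sarcasm. How would you handle something like "Well, the weather in England is great again, as usual!" ;) Good luck!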