Getting strange results, how can I tell if it is a programming error or a PHP error (using arrays)? - Hack The Tech - Latest News related to Computer and Technology


Wednesday, April 26, 2023

Getting strange results, how can I tell if it is a programming error or a PHP error (using arrays)?

Before I start: I am not sure whether this is a programming error, my misunderstanding of PHP's logic, or a bug in the PHP build I am using. I am examining documents. I build word-tree arrays in which each entry has the form array(string, count), where count is the number of times the string has shown up in the document so far. Once I add an item to the word tree, I have to leave it in place, because other parts of the program do not keep a copy of the string; they keep the index of the table entry where the string can be found. The program does, however, create a BTree: a list of indices that can be used to list the strings in sorted order and that allows a binary search for a string. Below is the code used to find the index for a passed string (word) and, if necessary, add the string to the word tree.

function PinWord($word, &$wordTree, &$bTree, &$WTI)
{
    // wv and wc select the string and its count inside an entry; they are
    // assumed to be constants defined elsewhere (presumably 0 and 1).
    global $BTreeA, $BTreeO;

    $F = 0;
    $L = count($wordTree) - 1;
    if (count($wordTree) != count($bTree)) {
        exit("mismatch at L=$L"); // should never happen
    }
    if ($L < 0) { // first time through ($WTI = -1 initially)
        $wordTree[++$WTI] = array($word, 1);
        $bTree[0] = 0;
        return 0;
    }

    while ($F <= $L) { // do a binary search for the string
        $i = intval(($F + $L) / 2);
        $j = $bTree[$i];
        $test = $wordTree[$j][wv];
        if ($word == $test) {
            $wordTree[$j][wc]++; // bump count
            return $j; // found it
        }
        if ($word < $test) {
            $L = --$i; // change upper limit
        } else {
            $F = ++$i; // change lower limit
        }
    }

    // Not found: rebuild $bTree with the new word included.
    $WTI = count($wordTree); // redundant if everything is working
    $bt = array(); // maps string => index into $wordTree
    foreach ($wordTree as $k => $A) {
        $bt[$A[wv]] = $k;
    }
    if (count($wordTree) != count($bt)) {
        exit("duplicate words in word tree"); // shouldn't happen
    }
    if (array_key_exists($word, $bt)) { // oops, shouldn't happen
        Debug_M("misplaced key i=$i t=$bt[$word] key='$word'"); // report error
        return $bt[$word]; // out of order, return anyway
    }
    $bt[$word] = $WTI;
    ksort($bt); // sort strings in ascending order
    $bTree = array_values($bt); // replace $bTree
    $wordTree[] = array($word, 1);

    if ($wordTree[$WTI][wv] != $word) {
        exit("really bad, '$word' not same as '" . $wordTree[$WTI][wv] . "'");
    }
    if (count($wordTree) != count($bTree)) {
        exit("mismatch B at $WTI");
    }
    return $WTI;
}

The program has two word trees. One holds only strings in which every character is a letter of the (English) alphabet. The logic works fine for that tree, so it does not seem to be a coding error. The other holds only strings that contain no such letters, and that tree gets messed up: sometimes the bTree is not in the proper order, and sometimes a duplicate string is added to the word tree. How a duplicate escapes the array_key_exists test, I have no idea. That is the main question.

Auxiliary questions: The ksort seems to have a lot of overhead. Any ideas on how to use array_slice, array_merge, and the variables $word and $test instead? Do you think having multiple bTrees would help (either one for each first letter of the string or one for each string length)? That could help find the strings faster.
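
One candidate explanation worth ruling out, offered as a sketch rather than a confirmed diagnosis: strings without letters are often numeric-looking, and PHP's loose operators == and < compare two numeric strings as numbers, while array keys such as "10" are converted to integers. The sample strings below are hypothetical and only illustrate how those rules can disagree with a byte-wise ordering.

```php
<?php
// Letter-free strings that PHP treats as numeric (hypothetical samples).
$looseEqual  = ("1.0" == "1");           // true: compared as the numbers 1.0 and 1
$strictEqual = ("1.0" === "1");          // false: different byte sequences
$looseLess   = ("9" < "10");             // true: compared as 9 < 10
$byteLess    = (strcmp("9", "10") < 0);  // false: "9" sorts after "10" byte-wise

// Mixing in a non-numeric string makes "<" non-transitive:
// "9" < "10" (numeric), "10" < "2-" (byte-wise), "2-" < "9" (byte-wise).
$cycle = ("9" < "10") && ("10" < "2-") && ("2-" < "9"); // true

// Keys that look like decimal integers are converted to int; others stay strings.
$keys = array_keys(["10" => 0, "007" => 1, "1.0" => 2]); // [10, "007", "1.0"]

var_dump($looseEqual, $strictEqual, $looseLess, $byteLess, $cycle, $keys);
```

If the letter-free tree contains strings like these, a binary search built on == and < may not agree with itself or with the order ksort produces, which would be consistent with the symptoms described. Comparing with strcmp and === sidesteps the numeric rules.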


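On the ksort overhead, a minimal sketch of one alternative, assuming entries are stored as [string, count] at positions 0 and 1 and that a byte-wise strcmp ordering is acceptable: a failed binary search already ends at the position where the new index belongs, so array_splice (swapped in here for the array_slice/array_merge idea) can insert it there without rebuilding or re-sorting. The function name pinWordStrict and the sample words are hypothetical, and the code assumes PHP 7.4 or later.

```php
<?php
/**
 * Hypothetical variant of PinWord: returns the word's index in $wordTree,
 * adding the word if it is missing. $bTree holds indices into $wordTree,
 * ordered by strcmp() on the words.
 */
function pinWordStrict(string $word, array &$wordTree, array &$bTree): int
{
    $lo = 0;
    $hi = count($bTree) - 1;
    while ($lo <= $hi) {
        $mid = intdiv($lo + $hi, 2);
        $j   = $bTree[$mid];
        $cmp = strcmp($word, $wordTree[$j][0]); // byte-wise, never numeric
        if ($cmp === 0) {
            $wordTree[$j][1]++;                 // bump the count
            return $j;
        }
        if ($cmp < 0) {
            $hi = $mid - 1;
        } else {
            $lo = $mid + 1;
        }
    }
    // Not found: $lo is the position where the new index belongs.
    $wordTree[] = [$word, 1];
    $newIndex   = count($wordTree) - 1;
    array_splice($bTree, $lo, 0, [$newIndex]);  // insert without re-sorting
    return $newIndex;
}

$wordTree = [];
$bTree    = [];
foreach (["9", "10", "1.0", "1", "2-", "10"] as $w) {
    pinWordStrict($w, $wordTree, $bTree);
}

// Words listed in $bTree order.
$sorted = array_map(fn($i) => $wordTree[$i][0], $bTree);
print_r($sorted);
```

array_splice still shifts elements, so each insertion remains linear in the size of $bTree, but it avoids the full rebuild and sort on every new word.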

source https://stackoverflow.com/questions/76104282/getting-strange-results-how-can-i-tell-if-programming-error-or-php-error-ussin
