You are not logged in.

Announcement

 Téléchargez la dernière version stable de GLPI      -     Et vous, que pouvez vous faire pour le projet GLPI ? :  Contribuer
 Download last stable version of GLPI                      -     What can you do for GLPI ? :  Contribute

#1 2014-01-27 15:54:54

tomolimo
Member
From: Grenoble, France
Registered: 2009-05-12
Posts: 517

HTML::clean function adds ' ' (spaces) when converting HTML into TXT

Hello,
This is a problem when using HTML emails that are converted into TXT for ticket creation.
In file html.class.php, the clean() function (line 51 and 58):

static function clean($value) {

      $value = preg_replace("/<(p|br)( [^>]*)?".">/i", "\n", $value);

      $specialfilter = array('@<span[^>]*?x-hidden[^>]*?>.*?</span[^>]*?>@si'); // Strip ToolTips
-->     $value = preg_replace($specialfilter, ' ', $value);

      $search = array('@<script[^>]*?>.*?</script[^>]*?>@si', // Strip out javascript
                      '@<style[^>]*?>.*?</style[^>]*?>@si',   // Strip style tags properly
                      '@<[\/\!]*?[^<>]*?>@si',                // Strip out HTML tags
                      '@<![\s\S]*?--[ \t\n\r]*>@');           // Strip multi-line comments including CDATA

 -->     $value = preg_replace($search, ' ', $value);

      $value = preg_replace("/(&nbsp;| )+/", " ", $value);
      // nettoyer l'apostrophe curly qui pose probleme a certains rss-readers, lecteurs de mail...
      $value = str_replace("&#8217;", "'", $value);

   // Problem with this regex : may crash
   //   $value = preg_replace("/ +/u", " ", $value);
      $value = preg_replace("/\n{2,}/", "\n\n", $value,-1);

      return trim($value);
   }

This code leads to HTML tags being replaced by spaces. This is a problem when in a text, there are some formatting tags (like bold, italic, and so on). At the end you get the same text with spaces inside words.

I propose to simply replace the tags by empty strings like following:

static function clean($value) {

      $value = preg_replace("/<(p|br)( [^>]*)?".">/i", "\n", $value);

      $specialfilter = array('@<span[^>]*?x-hidden[^>]*?>.*?</span[^>]*?>@si'); // Strip ToolTips
-->      $value = preg_replace($specialfilter, '', $value);

      $search = array('@<script[^>]*?>.*?</script[^>]*?>@si', // Strip out javascript
                      '@<style[^>]*?>.*?</style[^>]*?>@si',   // Strip style tags properly
                      '@<[\/\!]*?[^<>]*?>@si',                // Strip out HTML tags
                      '@<![\s\S]*?--[ \t\n\r]*>@');           // Strip multi-line comments including CDATA

-->      $value = preg_replace($search, '', $value);

      $value = preg_replace("/(&nbsp;| )+/", " ", $value);
      // nettoyer l'apostrophe curly qui pose probleme a certains rss-readers, lecteurs de mail...
      $value = str_replace("&#8217;", "'", $value);

   // Problem with this regex : may crash
   //   $value = preg_replace("/ +/u", " ", $value);
      $value = preg_replace("/\n{2,}/", "\n\n", $value,-1);

      return trim($value);
   }

This code has been running for 6 months inside our GLPI instance 0.83.8.
could it be integrated into a patch?

regards,
tomolimo

Last edited by tomolimo (2014-01-27 15:55:43)


GLPI 9.5.5 - PHP 7.4 / ProcessMaker 3.3.0-community - PHP 7.1 / Windows 2016 / IIS  / MySQL 5.7
Worldwide: >17k Computers, >17k Users (16 languages, >11 timezones), >610k tickets, >6700 entities, >7600 Groups, >20700 process cases
Raynet is ARaymond (https://www.araymond.com) IT service management

Offline

#2 2014-01-29 11:03:41

MoYo
GLPI - Lead
From: Poitiers
Registered: 2004-09-13
Posts: 14,513
Website

Re: HTML::clean function adds ' ' (spaces) when converting HTML into TXT


MoYo - Julien Dombre - Association INDEPNET
Contribute to GLPI :    Support     Contribute     References     Freshmeat

Offline

#3 2014-01-29 11:25:23

tomolimo
Member
From: Grenoble, France
Registered: 2009-05-12
Posts: 517

Re: HTML::clean function adds ' ' (spaces) when converting HTML into TXT

thank you
regards,
Tomolimo


GLPI 9.5.5 - PHP 7.4 / ProcessMaker 3.3.0-community - PHP 7.1 / Windows 2016 / IIS  / MySQL 5.7
Worldwide: >17k Computers, >17k Users (16 languages, >11 timezones), >610k tickets, >6700 entities, >7600 Groups, >20700 process cases
Raynet is ARaymond (https://www.araymond.com) IT service management

Offline

Board footer

Powered by FluxBB