
UNIX filter - remove all codes - Automation


International Connections

Sep 18, 2003, 12:40:39 PM
The UNIX code I posted in 1996 can be modified to remove HTML markup
from HTML files before they are translated.
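
A minimal sketch of that adaptation, assuming every tag opens and closes on the same line. The function name strip_html_tags is hypothetical and not part of the 1996 filter. The pattern `<[^>]*>` matches a tag of any case together with its attributes, but it leaves comments, scripts and entities such as `&amp;` alone.

```shell
# Hypothetical helper: delete anything that looks like <...> from stdin.
# Assumes each tag opens and closes on the same line.
strip_html_tags() {
    sed -e 's/<[^>]*>//g'
}

printf '%s\n' '<p>Hello <b>world</b></p>' | strip_html_tags
# prints: Hello world
```

If the tags should become spaces instead of disappearing, so that neighbouring words do not run together, use `s/<[^>]*>/ /g`.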

For my new visitors using VB, the same result can be achieved with the
VB Replace function:
Description:
Returns a string in which a specified substring has been replaced with
another substring a specified number of times.

This can be very useful before a translation from English to French.

Example:

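NewString = "My Google email address was idgu...@interserv.com"
MsgBox NewString
' Arguments after the replacement text: Start = 1, Count = 1 (replace only
' the first match), Compare = 1 (vbTextCompare, case-insensitive).
NewString = Trim(Replace(NewString, "was idgu...@interserv.com", "is assi...@contactez.net", 1, 1, 1))
MsgBox NewString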
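
For comparison, a rough sed counterpart of a single replacement. This is only a sketch: the addresses are made-up example.com placeholders, the function name replace_first_address is hypothetical, and portable sed matches case-sensitively, unlike the text comparison requested in the VB call. Leaving off the trailing `g` flag makes sed replace only the first match on each line.

```shell
# Replace only the first match on each input line (no trailing "g" flag).
# The dots in the search pattern are escaped so that they match literally.
# Both addresses are placeholders invented for this illustration.
replace_first_address() {
    sed -e 's/was old@example\.com/is new@example.com/'
}

printf '%s\n' 'My email address was old@example.com' | replace_first_address
# prints: My email address is new@example.com
```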

Technical Support & Shareware Author
http://members.tripod.com/Frenchtranslator/
Webmaster & Administrator
Gurley Community & Town History Alabama USA
http://www.contactez.net/gurleyalabama
Southern Belle Provençal Gifts & Home Decoration
http://gift.contactez.net

===========

From: idgu...@interserv.com (idgu...@interserv.com)
Subject: UNIX filter B4 Translation - Votre compagnon
Newsgroups: comp.unix.sys5.r3
Date: 1996/04/03

# Many localization coordinators who contacted me asked for it, so here
# is a free copy of my UNIX filter to remove all codes and unwanted
# strings from a file before sending it to the DPTL/Transcend Automatic
# Translation System.
#
# Visit my page at: http://iquest.com/~btatro/lang.html
#
# UNIX filter created to remove all codes from a file before
# machine-translation processing.
#
# Contact: idgu...@interserv.com - Votre compagnon
# For other scripts see my posted messages in this news group.
#
# This is a simplified copy of my filter
# Script begins here:
# sed 's/<[A-Z]*[A-Z]>//g' filename | sed 's/<\/[A-Z]*[A-Z]>//g'
# sed 's/<[A-Z]*[A-Z]>//g'
# Use sed -e to chain more than one expression in a single sed:
# sed -e 's/yourcodes>.*<\/yourcodes//g' \
#     -e 's/yourcodes>.*<\/yourcodes//g' \
#     -e 's/yourcodes>.*<\/yourcodes//g'
node=`uname -n`

testnode ()
{
    if [ "$node" = your_node ]
    then
        echo "Invalid option!"
        echo "You are on your_node - No DocteurDomi for you -"
        sleep 2
    else
        echo "Run again with the name of a valid file to filter!"
        # Placeholder path: replace it with your own before running.
        /usr/ip32/vt200/vterm/ -xs DocteurDomi -c -f "/the path you want/files"
    fi
}

if [ "$1" = "" ]
then
    echo "You must enter the name of a file to filter"
    echo "Example: filtrer the_file"
    testnode
    exit 1
else
    if [ -f "$1" ]
    then
        # Note that the -e option can be used to limit the number of seds.
        # Here are some examples of seds:
        printf '\033[33m%s\033[0m\n' "Removing lines containing what_you_want"
        sed '/what_you_want/d' "$1" | sed '/what_you_want/d' |
            sed '/what_you_want/d' > FICHIER0
        printf '\033[33m%s\033[0m\n' "Removing lines containing: INDEXTOP - COLITEM - SECTION ID"
        sed '/what_you_want/d' FICHIER0 | sed '/what_you_want/d' |
            sed '/SECTION ID/d' > FICHIER
        printf '\033[33m%s\033[0m\n' "Removing lines containing: what_you_want"
        sed '/what_you_want/d' FICHIER | sed '/what_you_want/d' |
            sed '/HELP/d' | sed '/what_you_want/d' > FICHIER1
        printf '\033[33m%s\033[0m\n' "Removing lines containing: what_you_want"
        sed '/what_you_want/d' FICHIER1 > FICHIER2
        printf '\033[36m%s\033[0m\n' "what_you_want; replaced by: what_you_want"
        printf '\033[32m%s\033[0m\n' "Filtering out what_you_want and any other code"
        sed -e 's/what_you_want/: /g' -e 's/<[^>]*>/ /g' FICHIER2 > SORTIE
        /usr/bin/pg -p " q to quit paging >>" SORTIE
        printf '\033[36m%s\033[0m\n' "Clean up the temporary files (y/n)?"
        read rep
        if [ "$rep" = "y" ]
        then
            rm FICHIER*
            printf '\033[32m%s\033[0m\n' "All temporary files have been deleted"
            echo "The file to translate is: SORTIE"
        else
            printf '\033[33m%s\033[0m\n' "The file to translate is: SORTIE"
            printf '\033[36m%s\033[0m\n' "The temporary files have not been deleted"
        fi
    else
        printf '\033[32m%s\033[0m\n' "File $1 does not exist in the current directory"
        testnode
        exit 1
    fi
fi
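
The note about -e in the script suggests that the chain of piped seds and FICHIER* temporary files could be folded into one sed invocation. A hedged sketch of that idea follows. The function name filter_codes and the PATTERN_A and PATTERN_B strings are hypothetical stand-ins for the "what_you_want" placeholders; "SECTION ID" and "HELP" are taken from the script. It leaves out the step that replaces what_you_want with ": ".

```shell
# Sketch: one sed process instead of a chain of pipes and temporary files.
# PATTERN_A and PATTERN_B are placeholders to be replaced with real strings.
filter_codes() {
    sed -e '/PATTERN_A/d' \
        -e '/PATTERN_B/d' \
        -e '/SECTION ID/d' \
        -e '/HELP/d' \
        -e 's/<[^>]*>/ /g' "$1"
}

# Usage (hypothetical file name): filter_codes the_file > SORTIE
```

Because nothing is written to intermediate files, there is nothing to clean up afterwards. The price is that the intermediate results can no longer be inspected one step at a time.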
